

Compressed Pairings 155

B Implicit Exponentiation

Let $e = \sum_i e_i 3^i$ be the plain ternary representation of the exponent $e$. The following algorithm computes the compressed representation of $g^e$ for an element $g$ directly from the compressed representation of $g$.

C Solving the Curve Equation in Characteristic 3
Definition 3. The absolute trace of a field element $a \in \mathbb{F}_{3^m}$ is the linear form:

$$\mathrm{Tr}(a) = \sum_{i=0}^{m-1} a^{3^i} = a + a^3 + \cdots + a^{3^{m-1}}.$$

The absolute trace will always be in $\mathbb{F}_3$, as one can easily check by noticing from the above definition that $\mathrm{Tr}(a)^3 = \mathrm{Tr}(a)$ for all $a \in \mathbb{F}_{3^m}$. Being surjective and linear over $\mathbb{F}_3$, it can always be represented as a (usually sparse) dual vector in a given basis, so that one can compute $\mathrm{Tr}(a)$ in no more than $O(m)$ time. In a normal basis $\{\beta, \beta^3, \dots, \beta^{3^{m-1}}\}$ with $\mathrm{Tr}(\beta) = 1$, computing $\mathrm{Tr}(a)$ amounts to summing up all coefficients of $a$.
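As a concrete illustration of the definition above, the following sketch (a toy model of ours, not the paper's implementation) computes the absolute trace in the small field $\mathbb{F}_{3^3}$, built with the assumed irreducible polynomial $x^3 + 2x + 1$ over $\mathbb{F}_3$:

```python
# Toy model of the absolute trace over F_{3^m} for m = 3, in polynomial
# basis modulo the (assumed) irreducible f(x) = x^3 + 2x + 1 over F_3.
# Elements are coefficient tuples (a0, a1, a2) for a0 + a1*x + a2*x^2.
from itertools import product

def mul(a, b):
    """Multiply two field elements, reducing via x^3 = x + 2 (mod 3)."""
    p = [0] * 5
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            p[i + j] = (p[i + j] + ai * bj) % 3
    for k in (4, 3):                   # fold degrees 4 and 3 back down
        c, p[k] = p[k], 0
        p[k - 3] = (p[k - 3] + 2 * c) % 3
        p[k - 2] = (p[k - 2] + c) % 3
    return tuple(p[:3])

def cube(a):
    return mul(mul(a, a), a)

def trace(a):
    """Tr(a) = a + a^3 + a^9, the sum of the Frobenius conjugates."""
    t, b = list(a), a
    for _ in range(2):
        b = cube(b)
        t = [(u + v) % 3 for u, v in zip(t, b)]
    return tuple(t)

field = [tuple(e) for e in product(range(3), repeat=3)]
values = [trace(e) for e in field]
```

Every trace comes out as a constant polynomial $(t, 0, 0)$, and each value $t \in \{0, 1, 2\}$ occurs for exactly 9 of the 27 elements, matching surjectivity and $\mathbb{F}_3$-linearity.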
156 Michael Scott and Paulo S.L.M. Barreto

The coordinates $(x, y)$ of a curve point are constrained by the curve equation to satisfy $y^2 = x^3 - x + b$. Thus one can represent a point either as $(x, \tilde{y})$, where the bit $\tilde{y}$ indicates which of the two square roots corresponds to $y$, or else as $(\tilde{x}, y)$, where $\tilde{x}$ indicates which of the three solutions one has to take of the equation $x^3 - x + b - y^2 = 0$. In characteristic 3, cubing is a linear operation, which makes the second possibility more advantageous.
Consider the special equation $x^3 - x = c$ for a given $c \in \mathbb{F}_{3^m}$, which is relevant for supersingular curves in characteristic 3. This equation has a solution if, and only if, $\mathrm{Tr}(c) = 0$ [22, theorem 2.25]. This is the case for 1/3 of the elements in $\mathbb{F}_{3^m}$, since the trace function is linear and surjective. The complexity of solving the cubic equation is only $O(m^2)$, as we show now.

Let $\lambda: \mathbb{F}_{3^m} \to \mathbb{F}_{3^m}$ be defined by $\lambda(x) = x^3 - x$. The kernel of $\lambda$ is $\mathbb{F}_3$ [22, chapter 2, section 1], hence the rank of $\lambda$ is $m - 1$ [16, section 3.1, theorem 2].
Theorem 3. The equation $x^3 - x = c$ over $\mathbb{F}_{3^m}$ can be solved in $O(m^2)$ steps.

Proof. If $\mathbb{F}_{3^m}$ is represented in a standard polynomial basis, the cubic equation reduces to a system of linear equations with coefficients in $\mathbb{F}_3$, and can be solved in no more than $O(m^2)$ steps. This is achieved by first checking whether the system has solutions, i.e. whether $\mathrm{Tr}(c) = 0$. If so, since the rank of $\lambda$ is $m - 1$, one obtains an invertible $(m-1) \times (m-1)$ matrix $A$ by leaving out the one dependent row and correspondingly one column of the matrix representation of $\lambda$ on the given basis. A solution of the cubic equation is then given by an arbitrary value for the coordinate that was left out, together with the solution of the remaining linear system, which is obtained as $A^{-1}c'$ in $O(m^2)$ time.
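The solvability criterion $\mathrm{Tr}(c) = 0$ can be checked exhaustively in a toy field. The sketch below (an illustration of ours, not the paper's algorithm) brute-forces $\lambda(x) = x^3 - x$ over $\mathbb{F}_{3^3}$, built with the assumed irreducible $x^3 + 2x + 1$, and confirms that $x^3 - x = c$ has exactly three solutions (a coset of the kernel $\mathbb{F}_3$) when $\mathrm{Tr}(c) = 0$, and none otherwise:

```python
# Brute-force check in the toy field GF(3^3) = GF(3)[x]/(x^3 + 2x + 1)
# that x^3 - x = c is solvable exactly when Tr(c) = 0, with solutions
# forming a coset of the kernel GF(3) (so three solutions each time).
from itertools import product

def mul(a, b):  # multiply mod x^3 + 2x + 1 over GF(3); x^3 = x + 2
    p = [0] * 5
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            p[i + j] = (p[i + j] + ai * bj) % 3
    for k in (4, 3):
        c, p[k] = p[k], 0
        p[k - 3] = (p[k - 3] + 2 * c) % 3
        p[k - 2] = (p[k - 2] + c) % 3
    return tuple(p[:3])

def cube(a):
    return mul(mul(a, a), a)

def lam(a):  # the linear map lambda(x) = x^3 - x
    return tuple((ci - ai) % 3 for ci, ai in zip(cube(a), a))

def trace(a):  # Tr(a) = a + a^3 + a^9
    b, t = a, list(a)
    for _ in range(2):
        b = cube(b)
        t = [(u + v) % 3 for u, v in zip(t, b)]
    return tuple(t)

field = list(product(range(3), repeat=3))
for c in field:
    sols = [x for x in field if lam(x) == c]
    expected = 3 if trace(c) == (0, 0, 0) else 0
    assert len(sols) == expected

solvable = sum(1 for c in field if trace(c) == (0, 0, 0))
```

As expected, exactly 9 of the 27 right-hand sides $c$ (one third) are solvable.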
Using a normal basis to represent field elements, it is not difficult to see that the cubic equation can be efficiently solved in $O(m)$ time by the following algorithm (the proof is straightforward and left as an exercise):

Cubic equation solving in normal basis:

Asymptotically Optimal Communication
for Torus-Based Cryptography

Marten van Dijk1,2 and David Woodruff1,*
1 MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, USA
2 Philips Research Laboratories, Eindhoven, The Netherlands

Abstract. We introduce a compact and efficient representation of elements of the algebraic torus. This allows us to design a new discrete-log based public-key system achieving the optimal communication rate, partially answering the conjecture in [4]. For $n$ the product of distinct primes, we construct efficient ElGamal signature and encryption schemes in a subgroup of $\mathbb{F}_{q^n}^*$ in which the number of bits exchanged is only a $\varphi(n)/n$ fraction of that required in traditional schemes, while the security offered remains the same. We also present a Diffie-Hellman key exchange protocol averaging only $\varphi(n) \log q$ bits of communication per key. For the cryptographically important cases of $n = 30$ and $n = 210$, we transmit a 4/5 and a 24/35 fraction, respectively, of the number of bits required in XTR [14] and recent CEILIDH [24] cryptosystems.

1 Introduction
In classical Diffie-Hellman key exchange there are two fixed system parameters: a large prime $p$ and a generator $g$ of the multiplicative group of the field $\mathbb{F}_p$. In [10], the idea of working in finite extension fields instead of prime fields was proposed, but no computational or communication advantages were implied. In [26] Schnorr proposed working in a relatively small subgroup of $\mathbb{F}_p^*$ of prime order, improving the computational complexity of classical DH, but requiring the same amount of communication.
In [4] it is shown how to combine these two ideas so that the number of bits exchanged in DH key exchange is reduced by a factor of 3. Specifically, it is shown that elements of an order-$q$ subgroup $G$ of $\mathbb{F}_{p^6}^*$ can be efficiently represented using $2 \log p$ bits if $q$ divides $p^2 - p + 1$, which is one third of the $6 \log p$ bits required for elements of $\mathbb{F}_{p^6}^*$. Since the smallest field containing $G$ is $\mathbb{F}_{p^6}$, one can show [13] that with respect to attacks known today, the security of working in $G$ is the same as that of working in $\mathbb{F}_{p^6}^*$ for $p$ large enough. In [14,15] the XTR public key system was developed using the method of [4] together with an efficient arithmetic to achieve both computational and communication savings. These papers also show how to reduce communication in ElGamal encryption and signature schemes in $G$.
* Supported by an NDSEG fellowship.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 157–178, 2004.
© International Association for Cryptologic Research 2004

In [4] it was conjectured that one can extend this technique to any $n$ by working in the subgroup of $\mathbb{F}_{p^n}^*$ of order $\Phi_n(p)$, where $\Phi_n$ denotes the $n$th cyclotomic polynomial. Since the degree of $\Phi_n$ is $\varphi(n)$, where $\varphi$ is the Euler function, one could transmit a $\varphi(n)/n$ fraction of the number of bits needed in classical DH, while achieving the same level of security. For $n$ the product of the first $k$ primes, $\varphi(n)/n \to 0$ as $k \to \infty$, so the savings get better and better.
In [3,24], evidence that the techniques of [4] cannot generalize to arbitrary $n$ was presented, and some specific versions of the conjecture in [4] made in [3] were shown to be false. Also, in [24,25,23] it is shown that the subgroup of order $\Phi_n(q)$ is isomorphic to the well-studied algebraic torus $T_n(\mathbb{F}_q)$ [30], and that a positive answer to the conjecture in [4] is possible if one can construct an efficient rational parameterization of $T_n(\mathbb{F}_q)$. However, such a construction is only known when $n$ is a prime power or the product of two prime powers, although it is conjectured to exist for all $n$ [24,30]. In [24] a construction is given for $n = 6$, which is the basis for the CEILIDH public-key cryptosystem. CEILIDH achieves the same communication as XTR with a few computational advantages.

In this paper we finally break this barrier by constructing, for every $n$, efficient ElGamal encryption and signature schemes which require transmitting at most a $\varphi(n)/n$ fraction of the bits required in their classical counterparts. Further, we present an asymptotical variant of DH key exchange in which the average number of bits exchanged per key approaches $\varphi(n) \log q$. The key property that we use is the fact that $T_n$ is stably rational (see [30], section 5.1). Specifically, our enabling technique is the construction of efficiently computable bijections $\theta$ and $\theta^{-1}$, with

$$\theta : T_n(\mathbb{F}_q) \times \prod_{d \mid n,\; \mu(n/d) = -1} \mathbb{F}_{q^d}^* \longrightarrow \prod_{d \mid n,\; \mu(n/d) = 1} \mathbb{F}_{q^d}^*,$$

where $\times$ denotes direct product and $\mu$ is the Möbius function¹. This allows us to bypass the torus conjecture of [24] by relaxing the problem of efficiently representing a single symbol of $T_n(\mathbb{F}_q)$ to the problem of efficiently representing a sequence of symbols in $T_n(\mathbb{F}_q)$. Our bijections enable us to compactly represent a sequence of elements of $T_n(\mathbb{F}_q)$ with, for a long enough sequence, roughly $\varphi(n) \log q$ bits per element. We stress that while our key exchange protocol achieves the optimal reduction factor $n/\varphi(n)$ only asymptotically, our encryption and signature schemes achieve this even for the encrypting or signing of a single message.
Note that the domain and range of $\theta$ need not be isomorphic: letting $G_m$ denote the cyclic group of order $m$, the two sides can decompose into different products of cyclic groups. We show, however, that $\theta$ can be decomposed into isomorphisms plus a map requiring a table lookup, and we show how to choose $q$ so that constructing and querying this table is extremely efficient.
¹ For an integer $n$: $\mu(1) = 1$; $\mu(n) = 0$ if $n$ has a repeated factor; and $\mu(n) = (-1)^k$ if $n$ is a product of $k$ distinct primes (see [11], section 16.3).


Our choice of $n$ and $q$ for a fixed communication rate will also affect the security of our scheme. We give an efficient heuristic for choosing $q$ for the practical cases of $n = 30$ and $n = 210$, where we achieve a communication reduction by factors of 15/4 and 35/8, respectively. Further, for any $n$ we give an efficient algorithm for choosing $n$ and $q$ with a theoretical guarantee on its performance. This latter algorithm is primarily of theoretical interest, showing how to optimally choose $n$ and $q$ when $n$ tends to infinity for a sufficiently large security requirement.
While our main focus and contribution is on the communication complexity, we also calculate the amount of computation necessary to evaluate $\theta$ and $\theta^{-1}$ for general $n$, and we attempt to minimize the number of modular exponentiations. We show that our representation enjoys some of the same computational advantages of CEILIDH over XTR, including the ability to multiply elements of $T_n(\mathbb{F}_q)$ directly. This allows us to come close to the non-hybrid version of ElGamal encryption in [24]. Indeed, in addition to constructing a hybrid ElGamal encryption scheme, we construct a scheme in which, to encrypt a sequence of messages, we form ElGamal encryptions in $T_n(\mathbb{F}_q)$ plus one additional encryption using a symmetric cipher. Unfortunately, the computational complexity of our scheme is not that practical, whereas XTR, for instance, permits very efficient computations if just exponentiation is required. For $n = 30$ we hand-optimize the computation of $\theta$ and $\theta^{-1}$. Our analysis for general $n$ shows that all of our protocols and algorithms are (theoretically) efficient in the sizes of $n$ and $q$.
Outline: Section 2 discusses the algebraic and number-theoretic tools we use. In section 3 we construct the bijections $\theta$ and $\theta^{-1}$. Section 4 shows how to choose system parameters to guarantee security and efficiency, giving both a practical algorithm for $n = 30$ and $n = 210$, and a theoretical algorithm for general $n$. In section 5 we discuss our cryptographic applications. Section 6 treats the computational complexity of our bijections, and we conclude in section 7.

2 Preliminaries

2.1 Cyclotomic Polynomials and Algebraic Tori

We first state a few facts about the cyclotomic polynomials. See [19] for more details.

Definition 1. Let $n$ be a positive integer and let $\zeta_n = e^{2\pi i/n}$. The $n$th cyclotomic polynomial $\Phi_n(x)$ is defined by:

$$\Phi_n(x) = \prod_{\substack{1 \le k \le n \\ \gcd(k, n) = 1}} (x - \zeta_n^k).$$

It is easy to see that the degree of $\Phi_n(x)$ is $\varphi(n)$, where $\varphi$ is the Euler totient function. We also have:

$$x^n - 1 = \prod_{d \mid n} \Phi_d(x),$$


and, using the Möbius function $\mu$,

$$\Phi_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)}.$$

It can be shown that the cyclotomic polynomials are irreducible polynomials over $\mathbb{Q}$ with integer coefficients. For $q$ a prime power, let $\mathbb{F}_q$ denote the finite field with $q$ elements. For integers $n \ge 1$ we define the algebraic torus²

$$T_n(\mathbb{F}_q) = \{\alpha \in \mathbb{F}_{q^n}^* : \alpha^{\Phi_n(q)} = 1\}.$$
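The two displayed identities can be exercised directly. The following sketch (our own illustration; the helper names are ours) computes $\Phi_n$ over the integers by dividing $x^n - 1$ by the cyclotomic polynomials of the proper divisors of $n$:

```python
# Compute cyclotomic polynomials over the integers via
# Phi_n(x) = (x^n - 1) / prod_{d | n, d < n} Phi_d(x),
# using exact polynomial long division on coefficient lists
# (index i holds the coefficient of x^i).

def poly_mul(a, b):
    p = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            p[i + j] += ai * bj
    return p

def poly_div_exact(num, den):
    """Exact division of integer polynomials; returns the quotient."""
    num = num[:]
    q = [0] * (len(num) - len(den) + 1)
    for i in range(len(q) - 1, -1, -1):
        c = num[i + len(den) - 1] // den[-1]   # den is monic here
        q[i] = c
        for j, dj in enumerate(den):
            num[i + j] -= c * dj
    assert all(v == 0 for v in num)            # division must be exact
    return q

def cyclotomic(n):
    num = [-1] + [0] * (n - 1) + [1]           # x^n - 1
    den = [1]
    for d in range(1, n):
        if n % d == 0:
            den = poly_mul(den, cyclotomic(d))
    return poly_div_exact(num, den)
```

For instance, $\Phi_6(x) = x^2 - x + 1$, $\Phi_{30}$ has degree $\varphi(30) = 8$, and multiplying $\Phi_d$ over all $d \mid 12$ reproduces $x^{12} - 1$.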

2.2 Number Theory
The following is the celebrated prime number theorem (see [11], chapter 22):

Theorem 1. For large enough $x$, the number of primes less than or equal to $x$ is approximately $x / \ln x$.

We also need the fact that $\varphi(n) = \Omega(n / \log\log n)$ for any $n$, and that for $n$ the product of the first $k$ distinct primes, $\varphi(n) = \Theta(n / \log\log n)$. We use the following density theorem in our analysis:

Theorem 2. (Chebotarev [5,16]) For any integer $n$ and any $a$ with $\gcd(a, n) = 1$, the density of primes $p$ (among the set of all primes) with $p \equiv a \bmod n$ is $1/\varphi(n)$.
3 The Bijection
Let be a prime power, a positive integer, the multiplicative group of
the field of order and the algebraic torus over
For an integer let The goal of this section is to construct
efficiently computable bijections and where

Our strategy is to first find efficient bijections and where

Note that in general and need not be isomorphic. Let de-
note the cyclic group of order We first need a few lemmas. The following is
an immediate consequence of the structure theorem of abelian groups, but for
completeness and to exhibit the efficient isomorphisms, we include it:
Lemma 1. Suppose $m = m_1 m_2 \cdots m_k$ for pairwise relatively prime positive integers $m_1, \dots, m_k$. Then there exist efficiently computable isomorphisms between $G_m$ and $G_{m_1} \times \cdots \times G_{m_k}$.
² Technically, $T_n(\mathbb{F}_q)$ here just refers to the $\mathbb{F}_q$-points of the algebraic torus rather than the torus itself (see [24,30]).


Proof. For $1 \le i \le k$, put $m_i' = m/m_i$. Since the $m_i$ are pairwise relatively prime, $\gcd(m_1', \dots, m_k') = 1$, so there exist integers $a_1, \dots, a_k$ for which $\sum_i a_i m_i' = 1$. For $x \in G_m$, define $\rho(x) = (x^{m_1'}, \dots, x^{m_k'})$. Since $(x^{m_i'})^{m_i} = x^m = 1$, $\rho$ maps elements of $G_m$ to elements in the product group $G_{m_1} \times \cdots \times G_{m_k}$. For $(x_1, \dots, x_k) \in G_{m_1} \times \cdots \times G_{m_k}$, define $\sigma(x_1, \dots, x_k) = \prod_i x_i^{a_i}$, where multiplication occurs in $G_m$.

The claim is that $\rho$ and $\sigma$ are inverse isomorphisms between $G_m$ and $G_{m_1} \times \cdots \times G_{m_k}$. For $x \in G_m$ we have $\sigma(\rho(x)) = \prod_i x^{a_i m_i'} = x^{\sum_i a_i m_i'} = x$. Similarly, for $(x_1, \dots, x_k)$ we have $\rho(\sigma(x_1, \dots, x_k)) = \big(\prod_i x_i^{a_i m_1'}, \dots, \prod_i x_i^{a_i m_k'}\big)$.

Now, $m_i \mid m_j'$ if $i \ne j$, so in this case $x_i^{a_i m_j'} = 1$. Also, $a_j m_j' = 1 - \sum_{i \ne j} a_i m_i' \equiv 1 \bmod m_j$, so $x_j^{a_j m_j'} = x_j$. Hence $\rho(\sigma(x_1, \dots, x_k)) = (x_1, \dots, x_k)$, which shows $\rho$ and $\sigma$ are inverses. Observe that $\rho(xy) = \rho(x)\rho(y)$, and similarly $\sigma$ maps products to products, which shows that the maps are isomorphisms. Computing $\rho$ and $\sigma$ just requires multiplication and exponentiation, which can be made efficient by repeated squaring.
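The maps $\rho$ and $\sigma$ from the proof can be sketched concretely in the cyclic group $\mathbb{Z}_{31}^*$ of order $30 = 2 \cdot 3 \cdot 5$ (a toy instance of ours; the coefficients $a_i$ with $\sum_i a_i m_i' = 1$ were found by hand):

```python
# Sketch of the proof of Lemma 1 for the cyclic group Z_31^* of order
# m = 30 = 2 * 3 * 5: rho(x) = (x^{m_1'}, ..., x^{m_k'}) with
# m_i' = m / m_i, and sigma(x_1, ..., x_k) = prod_i x_i^{a_i} where
# sum_i a_i * m_i' = 1.

p, m = 31, 30
ms = [2, 3, 5]
mprime = [m // mi for mi in ms]   # [15, 10, 6]
a = [1, 1, -4]                    # 1*15 + 1*10 - 4*6 = 1

def rho(x):
    return tuple(pow(x, mp, p) for mp in mprime)

def sigma(xs):
    r = 1
    for xi, ai in zip(xs, a):
        r = r * pow(xi, ai, p) % p   # negative exponent = modular inverse
    return r

# sigma . rho is the identity on the whole group
for x in range(1, p):
    assert sigma(rho(x)) == x
```

The roundtrip works because $\sigma(\rho(x)) = x^{\sum_i a_i m_i'} = x^1$, exactly as in the proof.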

Let be the smallest positive integer for which
for all with and

Lemma 2. For let Then
Furthermore, the isomorphisms are efficiently computable.

Proof. By lemma 1 it suffices to show (1) (2) for all
and (3) for all
Using the fact that the following establishes (1):

where the second equality follows from the definition of U. For (2), observe that

since if prime by minimality of U there exist for which
so if then a contra-
diction. To see (3), note that by the
definition of U.

We use the following bijections with complexity proportional to U, which we
later show to be negligible for an appropriate choice of
Lemma 3. For let There exist bijections between
and requiring time to evaluate and
space for any
Proof. Using the definition of U,

so there exists a bijection between the two groups. Choose a generator of
and generators of For each make a table entry mapping
to a unique tuple Since the sum of the divisors of is less than
for any ([11], section 18.3), the table consumes
space. We sort the entries in both directions so that both bijections are efficient.
Evaluations of either bijection can then be performed with a binary search in
We need another auxiliary map:
Lemma 4. Let and be as in the previous two lemmas. Then,
Furthermore, the isomorphisms are efficiently computable.
Proof. It suffices to show for any and that this isomorphism is efficiently computable. Note that
since by the definition of U. By the same observation, Lemma 1 establishes the claim.
The following is immediate from the previous 3 lemmas:
Lemma 5. Assuming the maps of lemma 3 are efficient, there exist efficiently
computable bijections and where
We now have the bijection claimed at the beginning:
Theorem 3. Assuming the maps of lemma 3 are efficient, there exist efficiently
computable bijections and where

Proof. Lemma 5 gives efficient bijections between
and and also between
and By permuting coordinates, the theorem will fol-
low if we show the multiset equality


From section 2, $\Phi_n(x) \cdot \prod_{d \mid n,\, \mu(n/d) = -1} (x^d - 1) = \prod_{d \mid n,\, \mu(n/d) = 1} (x^d - 1)$ in the polynomial ring $\mathbb{Q}[x]$. Decomposing this equation into irreducible polynomials, we have

$$\Phi_n(x) \cdot \prod_{d \mid n,\, \mu(n/d) = -1}\; \prod_{e \mid d} \Phi_e(x) \;=\; \prod_{d \mid n,\, \mu(n/d) = 1}\; \prod_{e \mid d} \Phi_e(x),$$

and since $\mathbb{Q}[x]$ is a unique factorization domain, the irreducible polynomials on the left must be the same as those on the right. This gives the desired multiset equality.

4 Parameter Selection
The two constraints on choosing and for fixed are security and efficiency
constraints, the latter measured by the size of the tables needed in our
bijections. We first discuss the role of security in parameter selection:

4.1 Security Measures
Our schemes derive their security from the same assumptions of XTR and
CEILIDH. That is, if there is a successful attack against one of our crypto-
graphic primitives, then there is a successful attack against the corresponding
primitive in the underlying group $G$ we use, which we assume is impossible. Let $G$ be a multiplicative group of order $\ell$ with generator $g$. The security of our applications relies on the hardness of both the Computational Diffie-Hellman problem (CDH) and the Decisional Diffie-Hellman problem (DDH) in $G$. The former is the problem of computing $g^{ab}$ given $g^a$ and $g^b$, and the latter is that of distinguishing triples of the form $(g^a, g^b, g^{ab})$ from $(g^a, g^b, g^c)$ for random $a$, $b$, and $c$. The hardness of both of these problems implies the hardness of the discrete logarithm problem (DL) in $G$: find $x$ given $g^x$. Due to the Pohlig-Hellman algorithm [21], the DL problem in $G$ can be reduced to the DL problem in all prime order subgroups of $G$, so we might as well assume that $\ell$ is prime.

There are two known approaches to solving the DL problem in $G$ [1,7,9,13,20,27,28]: one which attacks the full multiplicative group of the surrounding field itself using the Discrete Logarithm variant of the Number Field Sieve, and one which concentrates directly on the subgroup $G$ using Pollard's Birthday Paradox based rho method [22]. Let $t$ be the smallest divisor of $n$ for which $G$ can be embedded in $\mathbb{F}_{q^t}^*$. The heuristic expected running time of the first attack is $\exp\big((1.923 + o(1))(\ln q^t)^{1/3} (\ln \ln q^t)^{2/3}\big)$. If $q$ is small, e.g. $q = 2$, then the constant 1.923 can be replaced with 1.53. The second attack, due to Pollard, takes $O(\sqrt{\ell})$ operations in $G$.
Hence we see that the difficulty of solving the DL problem in $G$ depends both on the size of the minimal surrounding subfield and on the size of its prime order $\ell$. If $\mathbb{F}_{q^n}$ is itself the minimal surrounding subfield, as is the case if we choose $\ell \mid \Phi_n(q)$ with $\ell \nmid n$, then for sufficiently large parameters the DL, CDH, and DDH problems in $G$ are widely believed to be just as hard as solving their classical counterparts w.r.t. an element of prime order $\ell$ in a prime field of comparable cardinality [14]. As mentioned in [14], solving the DL problem in such a $G$ is generally believed to be harder than factoring a 1024-bit RSA modulus provided $\ell$ is not too small.

4.2 Practical Algorithm for $n = 30$ and $n = 210$

Based on our security discussion, it is shown in [4] that, assuming an RSA key length between 1024 and 2048 bits gives adequate security, for $n = 30$ we should choose $q$ to be a prime between 35 and 70 bits long, and for $n = 210$ we should choose $q$ to be a prime between 5 and 10 bits long. Note that for the next value of $n$ for which we achieve a communication savings, $n = 2310$, the field size will have to be at least 2310 bits, so any setting of $q$ already exceeds the 2048 bits needed for adequate security.
In [13] it is shown how to quickly find a $q$ and an $\ell$ meeting these requirements for fixed $n$. The algorithm is heuristic, and involves choosing random $q$ of a certain size and checking whether $\Phi_n(q)$ contains a sufficiently large prime factor $\ell$, by trial division with the small primes up to a fixed bound. On a 166MHz processor, for $n = 30$ it was shown that it takes 12 seconds to find an $\ell$ of size between 214 and 251 bits for $q$ of size 32 bits. Note that for $n = 30$ we actually need $q$ to be slightly smaller, as claimed in the previous paragraph. This way we can achieve the largest efficiency gain for a fixed security guarantee. Using the algorithm of [13], fixing the size of $\ell$ to be approximately 161 bits and searching for an appropriate $q$ took three hours instead of the 12 seconds needed previously. However, there are three reasons we do not consider this to be problematic. First, CPU speeds are easily ten times as fast these days. Second, we don't need to fix the size of $\ell$ to be exactly 161 bits; we just need to find an $\ell$ of approximately this size. And third, finding the system parameters is a one-time cost and can be done offline, or even by a trusted third party.
From the efficiency analysis in the next section and lemma 6, one can show
that the table size resulting from choosing at random subject to the
above constraints is likely to be small with good probability. Hence, this heuristic
algorithm is likely to find a and an so that both security and efficiency
constraints are met in a reasonable amount of time.
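The heuristic search described above can be sketched as follows for the illustrative case $n = 6$, where $\Phi_6(q) = q^2 - q + 1$ (toy bit lengths and trial-division bound; `is_probable_prime` is a routine Miller-Rabin test of ours, not the routine of [13]):

```python
# Heuristic parameter search sketch for n = 6: pick random primes q of a
# given bit length, strip small prime factors from Phi_6(q) = q^2 - q + 1
# by trial division, and accept when the remaining cofactor ell is itself
# a (large) probable prime. Bit sizes here are toy values.
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probable-prime test with random bases."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13):
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def find_params(bits=48, trial_bound=10000):
    while True:
        q = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if not is_probable_prime(q):
            continue
        ell = q * q - q + 1              # Phi_6(q)
        for sp in range(2, trial_bound):  # strip small factors
            while ell % sp == 0:
                ell //= sp
        if ell > 1 and is_probable_prime(ell):
            return q, ell

q, ell = find_params()
```

The returned `ell` is a large probable-prime factor of $\Phi_6(q)$, as the heuristic requires.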

4.3 Theoretical Algorithm for General $n$ with Probabilistic Guarantees

In this section we use properties of the density of primes to design a parameter selection algorithm and rigorously analyze its performance. Unfortunately, since the factorization of $\Phi_n(q)$ for random primes $q$ does not seem to be well-understood, we are forced to choose $q$ in a special way which, with respect to attacks known today, doesn't allow for choosing the optimal $q$ for $n = 30$ and $n = 210$ if we just want 2048-bit RSA security. A straightforward calculation shows that the following algorithm gives us the largest efficiency gain for a fixed security guarantee only once the field size is at least 558 bits. Hence, we should view the algorithm as theoretical in nature, and apply the heuristic of the previous section for small parameters.
Let $k$ be a positive integer tending to infinity and let $n$ be the product of the first $k$ primes. We want to choose $q$ so that:

1. $q^n$ is sufficiently large.
2. There exists a large prime factor $\ell$ of $\Phi_n(q)$.
3. $U$ is small.

We say an integer is squarefree if it contains no repeated factors. The selection
algorithm is as follows:
Parameter Selection Algorithm

1. Let S be the subset of the first primes for which is squarefree, and
2. Find an R-bit prime for which and find a of order
3. Find a Q-bit prime for some integer such that:
(a) For all , where denotes the order of
(b) For all
4. Find a generator of the subgroup of order of Output and

We first claim that if the PSA algorithm terminates, then and meet the
aforementioned properties. By setting Q large enough, the first property holds.
We have for some integer and since
Hence by choosing R sufficiently large, the second
property holds. To show is small, we need the following lemma:
Lemma 6. Let be a prime and an integer such that Then if and
only if In case of the latter, if and only if

Proof. By minimality of U, if and only if there exist divisors of
for which Fix two such divisors and let
and suppose Since
Since we
have so or Similarly, But then
contradicting our choice of Hence, which means
Suppose there is another divisor of for which Then by the
above, and and since contradicting the fact
that is squarefree. This means that is the unique pair of divisors for which
Since and since
Put and Then is the smallest positive
integer for which so Also,
Hence if
then and Conversely, if then for these

We have shown if and only if The above shows that
if then and conversely if


Remark 1. Note that since on the one hand we have
and on the other hand we have
Hence if then
so it follows that
The following lemma provides tight asymptotic bounds on
Lemma 7. If the PSA algorithm terminates, where is
Artin's constant.
Proof. By the previous lemma, if then so Now if
is not squarefree, so by step 3b, so On the other
hand, if is a product of distinct primes in so
and hence Combining this with the remark above, step 3a of the PSA
algorithm, and the previous lemma, we conclude that U is exactly the square of
the product of primes in S and that the PSA algorithm chooses so that U is
To obtain the bound on U it suffices to show that the density of primes
for which is squarefree is C, where C is Artin's constant [8]. The bound
will then hold for large enough For a prime is not squarefree if and
only if for a prime By the inclusion-exclusion principle, the
multiplicativity of and theorem 2, the density of primes for which
is squarefree is:

By theorem 1, for sufficiently large and where the
approximation is up to low order terms. Hence,

Finally, we show the PSA algorithm terminates quickly in expectation:
Efficiency Analysis: By theorem 1, and Determining
S and T in step 1 can therefore be done by trial division in time. We
can perform step 2 by choosing a random R-bit number efficiently checking
if is prime, and checking if This requires an expected
samples. To find we choose a random set
and check that for all proper divisors of In expectation,
after O(logR) trials one such will be a generator of for which setting
gives with Conversely, if for all proper divisors of
we have then Since the number of proper divisors
of is for any ([11], section 18.1), the check in step 2 is efficient.
For step 3, for each we can find an element with
by simply trying each of the elements of until we succeed.

We then choose a random integer for which is a Q-bit number and
efficiently check if is prime. If so, then for each we can compute in
time, then check if by repeated squaring. For each
we check if
The claim is that the number of random samples needed in step 3 is only
Using the fact that the density of primes amongst integers of the
form is an integer for which is prime can be found
with O(Q) samples in expectation. By independence, the density of primes
which are for every is where C is
Artin's constant. Fix any By theorem 2, for all but a negligible fraction
of primes for a generator of Since is a generator,
if and only if is a multiple of and there are only
such multiples. By theorem 2, it is equally likely that for
any so the density of primes for which is at least
By independence, the density of for which for all
is at least Applying independence
one last time, we conclude that can be found with an expected
Finally, step 4 can be implemented by choosing a random and making
sure that The number of generators of is which is
so the expected number of samples needed is

5 Cryptographic Applications
Let be the product of the first primes, and let and be public param-
eters generated as in section 4. Define and
and observe that From section 3,
we have an efficiently computable bijection and its inverse with

From the proof of theorem 3, we see that there are a number of choices
for depending on which coordinate permutation is chosen. While this choice
does not affect the communication of our protocols or the size of our encryp-
tions/signatures, it can affect the computational costs. In section 6 we choose a
specific permutation and analyze the computational requirements for
We will think of and as efficiently computable maps between
and by fixing polynomial representations of with An ele-
ment of is then just a list of coefficients with respect to these
polynomials, and can be treated as an element of Let
denote the coordinates of an element corresponding
to the coefficients of with respect to the irreducible polynomial for Our
map may not be well-defined because we may have

However, if is chosen randomly, the probability that some coordinate
of is zero is less than for any which is negligible.
The same is true of a randomly chosen element of Hence, if we apply
and to random and and
are well-defined with overwhelming probability.
It is possible to modify and if one wants more than a probabilistic
guarantee. Define and We
can efficiently extend to the well-defined map

where for each and for each with
if we replace with 1, obtaining a new string
and define where for all if and only
if for the jth divisor Note that the inverse of
restricted to the image of is also well-defined. Similarly, letting denote
we can extend to a well-defined map
and construct
The next sections describe our cryptographic applications. For simplicity,
in our security analyses we assume and are actually bijections between
and although it should be understood that our pro-
tocols can be slightly modified so that or can be used without affecting
the security. The only application where this is not immediately obvious is the
non-hybrid ElGamal encryption, but step 3 of that protocol can be modified to
additionally encrypt the “extra bits” from using, say, the same key used in
step 3.

5.1 Diffie-Hellman Key Agreement
For Alice and Bob to agree on a sequence of secret keys they engage in
the following protocol:
1. Alice and Bob choose random and in respectively,
and treat them as elements of
2. For to
(a) Alice selects a random integer with sets com-
putes and transmits to Bob.
(b) Bob selects a random integer with sets computes
and transmits to Alice.
3. Alice sends to Bob and Bob sends to Alice.
4. For to 1.
(a) Alice computes and sets
(b) Bob computes and sets
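Stripped of the compression maps $\theta$ and $\theta^{-1}$, each round of steps 2(a)-2(b) is ordinary Diffie-Hellman in a prime-order subgroup. The toy sketch below (our own illustrative parameters $p = 2027$, $\ell = 1013$, far too small for real use) shows that bookkeeping:

```python
# Toy model of steps 2(a)-2(b) without the torus compression: plain
# Diffie-Hellman in a prime-order subgroup of Z_p^*. Here ell = 1013
# is a prime dividing p - 1 = 2026.
import random

p = 2027                       # prime
ell = 1013                     # prime subgroup order
g = pow(2, (p - 1) // ell, p)  # generator of the order-ell subgroup
assert g != 1 and pow(g, ell, p) == 1

def keypair():
    b = random.randrange(1, ell)
    return b, pow(g, b, p)

b_alice, A = keypair()         # Alice's contribution (step 2a)
b_bob, B = keypair()           # Bob's contribution (step 2b)

# Both parties derive the same shared group element (step 4)
shared = pow(B, b_alice, p)
assert shared == pow(A, b_bob, p)
```

In the real protocol, each transmitted group element would additionally pass through $\theta^{-1}$ for compression and $\theta$ on receipt.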


The number of bits sent from Alice to Bob (and from Bob to Alice) is about
so the rate approaches the optimal bits per key
as gets large. This beats all known schemes for In particular, for
our scheme requires only bits per shared key while generalizing
the scheme in section 4.11 of [14] to gives a scheme requiring bits
per key exchange. The scheme in [24] would also achieve our rate, but needs an
unproven conjecture concerning the rationality of
Observe that and are random, and since is a bijection, the
last coordinates of are of a random element in
Hence the probability that some coordinate of is zero is even less than that for
a random element in which is negligible. One can then verify that every
application of or is on a random element. It follows from the foregoing
discussion and the union bound that the probability of either Alice or Bob ever
attempting to apply or on an element outside of the domain is negligible.
For deterministic guarantees, one can replace and with and negligibly
changing the rate to for any Given the overwhelming
probability guarantees for and this does not seem necessary.
Security: An eavesdropper obtains and Since
and are efficient bijections, this is equivalent to obtaining
and Since and are random, determining a shared secret
is equivalent to solving the CDH problem in given

5.2 ElGamal Signature Schemes
Suppose the message M to be signed is at least bits long. If
this is not the case, one can wait until there are messages to be signed
for which then define M to be the concatenation
and sign M. For a random let be Alice's
private key and her public key. Let be a cryptographic
hash function. We have the following generalized ElGamal signature scheme (see
p.458 of [18] for background):
Signature Generation (M):
1. Alice selects a random secret integer and computes
2. Alice then computes
3. Alice expresses as computes
and outputs (S, T) as her signature.
Signature Verification (M, S, T):
1. Bob computes and constructs M and from R and S.
2. Bob accepts the signature if and only if
The communication of this scheme is at the optimal for
ElGamal signature schemes, even for one message (as long as M is large enough).

This beats the communication of the scheme in [4,17]
when in particular for the practical values and Our
communication is the same as that in [24], but we do not rely on any conjectures.
Note that our map may fail since M need not be random. One can avoid
this by excluding the negligibly few M for which is not defined (as in RSA
or the schemes of [24]), or one can replace with as defined above, and
communicate an additional bits of overhead. Alternatively Alice can use
a pseudorandom generator to randomize M and communicate the small seed
used to Bob, requiring even less communication than the already asymptotically
negligible bits.
We note that a simple modification of our protocol, making it similar in
spirit to our key exchange protocol, can allow Alice to sign each individually,
allowing for incremental verification.
Security: In this scheme the verifier obtains (S,T), which is equivalent to ob-
taining M, and Thus, the security of this scheme reduces to the security of
the generalized ElGamal signature scheme in

5.3 ElGamal Encryption
We present two flavors of ElGamal encryption. The first is a hybrid scheme with shorter encryptions than the one in [14], while the second is essentially a non-hybrid analogue of ElGamal in $T_n(\mathbb{F}_q)$. In the second, to encrypt a sequence of messages, a sequence of encryptions is created, all but one of which are performed directly in $T_n(\mathbb{F}_q)$. The first scheme achieves optimal communication, while the second is asymptotically optimal.

Hybrid ElGamal. For random let be Bob's private key and
his public key. Suppose Alice wants to encrypt the message
with Bob™s public key. Let E be an agreed upon symmetric encryption scheme
with domain We have the following protocol:
Encryption (M):
1. Alice selects a random secret integer and computes
2. From B Alice computes
3. From Alice derives a key Q for E and computes the encryption of M,
E(M), under key Q. Alice writes E(M) as
4. Alice computes and outputs her encryption (S,T).
Decryption (S,T):
1. Bob computes
2. From and Bob computes
3. From Bob derives Q and decrypts E(M) = (R, S) to obtain and output
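Ignoring the compression map and instantiating the symmetric cipher $E$ as a simple SHA-256 keystream XOR (both our own simplifications, purely for illustration), the hybrid scheme's flow looks like this:

```python
# Toy version of the hybrid scheme in a prime-order subgroup of Z_p^*,
# with the symmetric cipher E instantiated as a SHA-256 keystream XOR.
# The torus compression is omitted; p and ell are tiny toy parameters.
import hashlib, random

p, ell = 2027, 1013            # 1013 is a prime dividing p - 1
g = pow(2, (p - 1) // ell, p)

def E(key_int, msg: bytes) -> bytes:
    """XOR msg against a keystream derived from the shared value."""
    stream = hashlib.sha256(str(key_int).encode()).digest()
    stream = (stream * (len(msg) // len(stream) + 1))[:len(msg)]
    return bytes(a ^ b for a, b in zip(msg, stream))

# Bob's key pair
b = random.randrange(1, ell)
B = pow(g, b, p)

def encrypt(msg: bytes):
    k = random.randrange(1, ell)   # Alice's ephemeral secret (step 1)
    S = pow(g, k, p)               # first component
    Q = pow(B, k, p)               # shared value -> symmetric key (step 3)
    return S, E(Q, msg)

def decrypt(S, ct):
    Q = pow(S, b, p)               # Bob recovers the shared value (step 1)
    return E(Q, ct)                # and strips the keystream (step 3)

S, ct = encrypt(b"attack at dawn")
assert decrypt(S, ct) == b"attack at dawn"
```

In the paper's scheme, $(S, T)$ would additionally be compressed with $\theta^{-1}$ before transmission, and $Q$ would be derived from the shared value by a proper key-derivation step.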
The communication of this scheme is at the optimal bits
for hybrid ElGamal encryption. As in our protocol for signature schemes, we


achieve this rate even for a single message. This beats the
bit scheme in [14] for
It is unlikely that or is applied to an element with any zero coordinates
since is random and E(M) is likely to “look random” in practice, so is
likely to be a random element of for which it is extremely unlikely that
any coordinates are zero. An exact analysis, though, depends on one's choice
of E. As in our protocol for signature schemes, one can randomize E(M) to
decrease the error probability or replace with for a deterministic guarantee
at the cost of a few bits of communication.
Security: An adversary learns (S,T), which is equivalent to learning and
E(M). Assuming the CDH problem is hard in the security of this scheme
is just that of the symmetric scheme E, assuming the key Q to E is chosen
reasonably from. To derive Q from, one can extract bits that are hard for
an eavesdropper to compute; see [2].

Almost Non-hybrid ElGamal. In the following, Alice will encrypt a sequence
of messages each in She will form encryptions,
of which are encryptions in and one requiring the use of an agreed upon
symmetric encryption scheme E.
In the encryption phase of our scheme we will apply to for some
For semantic security, for all it must hold that
which in general may be strictly contained in
For this we adopt the technique in section 3.7 of [25]. Namely, by reserving a
few bits of each to be “redundancy bits”, if has small enough index in
then for any R we need only try a few random settings of these bits until
which we can test by checking
if In the following protocol description we ignore this issue and assume
whenever is applied, its image is in
For random let be Bob's private keys and
be his public keys. We have the following scheme:
Encryption (M):

1. Alice chooses a random
2. For to
(a) Alice computes
(b) Alice chooses a random secret integer and forms the
3. Alice uses the hybrid ElGamal encryption scheme with symmetric cipher
E and public key to encrypt as with and

4. For to 1,
(a) Alice computes
(b) Alice computes
5. Alice outputs S as her encryption of
172 Marten van Dijk and David Woodruff


Decryption (S):
1. For to
(a) Bob computes
(b) Bob computes
(c) Bob computes
2. Bob uses and S, together with in the decryption procedure of the
hybrid ElGamal scheme to recover
3. For to 1, Bob computes
4. Bob outputs
The communication of this scheme is bits.
Hence, as grows, the rate of this scheme approaches which is
optimal for ElGamal-type encryption.
Note that the need not be random, and consequently
may not be well-defined. Choosing random will increase the chances that
is always defined. Alternatively, one can use the ideas of section
5.2 to randomize or one can use instead of. Again, since
needn't be random even if E is semantically secure, one may want to
use in place of. This adds a negligible amount to the communication, and as
stated earlier, encrypting the extra bits of can be done in step 3.
Security: An adversary learns S, which is equivalent
to learning where is the semantically secure
hybrid encryption scheme. Assuming DDH is hard in is a semantically
secure encryption of for all The security of the scheme then follows
from the fact that the keypairs and of are independent.

6 Computational Complexity
In this section we present efficient algorithms for computing, analyze
their complexity, and suggest an alternative way of improving computational
costs with slightly more communication. Each of these is described in turn.

6.1 Algorithm
Before describing and we need some notation:

For let be the smallest integer for which
for all with and
For we define and
Generalizing section 3, we can find and s.t.
Further, we can find and for which
Let for
be a bijective mapping and define

A naive implementation of consists of the following steps:
1. We first use an isomorphism

2. By using a table lookup we map
and we use an isomorphism
By the structure theorem of Abelian groups there is an isomorphism
for each with and
3. By using a permutation we obtain a mapping

4. By the structure theorem of Abelian groups there is, for each with
and an isomorphism By using a
table lookup we map and we use
an isomorphism
5. In the last step we use an isomorphism

Each of the isomorphisms is defined by taking simultaneous exponentiations.
An improved implementation combines different isomorphisms in a single simul-
taneous exponentiation. Each table lookup followed by an exponentiation can
be implemented as a single table lookup. This reduces the number of exponen-
tiations and multiplications.
Computation of for

1. For
(a) Compute and map it to by using
a table lookup.
(b) Compute
2. Compute
3. For
(a) Map to by using a table lookup.
(b) Compute
which is in
4. Multiply with
The ideas in section 3 can be used to show the algorithm above is well-defined.
The improved computation of is similar, where we make sure to use the
inverse of the coordinate permutation used in

6.2 Complexity
For background on efficient computations in fields and subgroups, see [6,12,29].
Consider the algorithm for. In step 1, for we perform
exponentiations in. Notice that in step 1b we do not need to compute
since it can be combined with the table lookup in step 1a (there is an entry
in the table corresponding to for every). Step 2 costs 1 exponentiation in

For or we precompute This
costs multiplications in By using the results of the precomputation, an
exponentiation for some in costs on average multiplications
in (the bit length of the exponent is and roughly half the time a
bit is equal to 1). Each multiplication in costs multiplications in
Summarizing, steps 1 and 2 cost about

multiplications in
In step 3, for we need to perform, for each with
one exponentiation in We do not need to compute which
can be combined with the table lookup in step 1a.
The cost of step 3, measured in multiplications in the base field is on
average approximately Since
defines a permutation, this expression is equal to

The total cost is multiplications in where we neglect the cost of table
lookups, addition, and multiplication modulo an integer. Since
we have
since the sum of divisors of is for any This proves

The same techniques show requires multiplications in

6.3 Efficiency Improvements
To improve the efficiency we may use exponentiation algorithms for fixed expo-
nents using vector addition chains. Also, we may group several exponentiations
of together into one exponentiation by appropriately choosing the bijections
If is not too large, we may use simultaneous exponentiation to speed up
the computations. Full simultaneous exponentiations in every step requires a
precomputation of multiplications. We may optimize by using simultaneous

exponentiation to compute intermediate results which we multiply together to
compute the full exponentiation. Finally, we may combine the exponentiations
required in our applications with the evaluation of
Notice that is much more efficient if, for with
Then, for with and Table
lookups can be avoided. Therefore each for with can
be computed by a single simultaneous exponentiation of
with fixed exponents in step 3. To make use of this, we define
a new map which maps into and
the table entries of This increases the communication
cost by

bits which in practice is much less than So at the cost of a small increase
in communication we improve the computational efficiency.
Computation of and

1. For compute
2. Compute

for with Multiply with
4. Compute
5. Compute

for with where
For and
We define

We use
and [31]. In step 1, we
compute and using single exponentiations via the square-and-multiply
method [18, p. 614]. This costs in total
multiplications in
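The square-and-multiply method cited from [18, p. 614] scans the exponent from its most significant bit, squaring once per bit and multiplying once per 1-bit; a generic sketch, with plain modular integers standing in for the paper's field elements:

```python
def square_and_multiply(x: int, e: int, n: int) -> int:
    """Left-to-right binary exponentiation: for a t-bit exponent, t squarings
    plus (on average) about t/2 multiplications, matching the cost counts above."""
    result = 1
    for bit in bin(e)[2:]:               # exponent bits, most significant first
        result = (result * result) % n   # always square
        if bit == "1":
            result = (result * x) % n    # multiply only on a 1-bit
    return result

assert square_and_multiply(7, 123, 1019) == pow(7, 123, 1019)
```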

In step 2, is computed as a simultaneous exponentiation [18, p. 618] in
In a precomputation we compute, for each of the possible sets, the product
The whole precomputation costs at most multiplications in
In the computation of the exponents of etc., have bit lengths
etc. This means that in the second half of the simultaneous
exponentiation (the last bits of the exponents) we only need to
square or square-and-multiply with So the average costs in the second
half of the simultaneous multiplication is equal to multiplications
in The simultaneous exponentiation corresponding to the bits ranging from
position to involves square or square and multiply with
or This costs on average multiplications (5 is the difference
between 15 and 10; on average we need 1 multiplication in 1 out of 4 cases
and 2 multiplications in 3 out of 4 cases). Notice that we treat squaring as a
single multiplication in this exercise. Continuing this argument we need in total

multiplications in ( comes from preprocessing).
The outputs and are single multiplications in and
respectively costing a total of mul-
tiplications. Concluding, the computation of costs approximately
multiplications in A single exponentiation in costs
multiplications. Hence, costs about 2.32 exponentiations in
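The simultaneous exponentiation of [18, p. 618] used above shares a single squaring chain across all bases: one precomputes the product of every subset of the bases, then scans the exponents' bit columns. A generic sketch with illustrative moduli (Shamir's trick, not the paper's tower-field arithmetic):

```python
def simultaneous_exp(bases, exps, n):
    """Compute the product of bases[i]**exps[i] mod n with one shared squaring
    chain: 2^k - 1 precomputed subset products, then one squaring per bit
    column and at most one multiplication per nonzero column."""
    k = len(bases)
    table = [1] * (1 << k)               # table[S] = product of bases[i], i in S
    for s in range(1, 1 << k):
        low = s & -s                     # lowest set bit of the subset mask
        i = low.bit_length() - 1
        table[s] = (table[s ^ low] * bases[i]) % n
    result = 1
    for pos in range(max(e.bit_length() for e in exps) - 1, -1, -1):
        result = (result * result) % n   # one squaring per bit column
        mask = sum(((e >> pos) & 1) << i for i, e in enumerate(exps))
        if mask:
            result = (result * table[mask]) % n
    return result

n = 1019
want = (pow(3, 123, n) * pow(5, 456, n) * pow(7, 89, n)) % n
assert simultaneous_exp([3, 5, 7], [123, 456, 89], n) == want
```

The shared squaring chain is why the per-column cost in the analysis above drops to one squaring plus at most one multiplication, rather than one full exponentiation per base.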
In the implementation of we compute as a single exponentiation in
costing multiplications. In step 5, is a simul-
taneous exponentiation in and (and a table lookup for the exponentiation
in). This costs multiplications.
Similarly, costs and costs
multiplications. We compute as
a single exponentiation in costing multipli-
cations. Concluding, the computation of costs approximately
multiplications, which is equivalent to 2.67 exponentiations in

7 Conclusions and Open Problems

Our fundamental contribution is a compact and efficient representation of ele-
ments of namely, the construction of bijections and of section 3.
This allows us to construct ElGamal signature and encryption schemes meeting
the optimal rate of communication, as well as a secret key exchange protocol
meeting this rate asymptotically. If the torus conjecture of [24] is proven, the
schemes in that paper will also achieve this rate, and moreover, their scheme for
DH key exchange will meet the optimal rate even for a single key exchanged.
Hence, resolving their conjecture is an important problem. Another important
question is whether the computational cost of our schemes can be reduced to
a more practical level. Finally, our representation of may have other applications.

References

1. L. M. Adleman, J. DeMarrais, A Subexponential Algorithm for Discrete Loga-
rithms over All Finite Fields, in Advances in Cryptology - Crypto '93, LNCS
773, Springer-Verlag, 1994, 147-158.
2. D. Boneh and R. Venkatesan, Rounding in lattices and its cryptographic applica-
tions, Proc. 8th Annual ACM-SIAM Symposium on Discrete Algorithms, ACM,
NY, 1997, 675-681.
3. W. Bosma, J. Hutton, and E. R. Verheul, Looking Beyond XTR, in Advances in
Cryptology - Asiacrypt '02, LNCS 2501, Springer, Berlin, 2002, 46-63.
4. A. E. Brouwer, R. Pellikaan, and E. R. Verheul, Doing More with Fewer Bits, in
Advances in Cryptology - Asiacrypt '99, LNCS 1716, Springer, 321-332.
5. N. G. Chebotarev, Die Bestimmung der Dichtigkeit einer Menge von Primzahlen,
welche zu einer gegebenen Substitutionsklasse gehören, Math. Ann. 95 (1926), 191-228.
6. H. Cohen and A. K. Lenstra, Supplement to Implementation of a New Primality
Test, Mathematics of Computation, volume 48, number 177, 1987.
7. D. Coppersmith, Fast Evaluation of Logarithms in Fields of Characteristic Two,
IEEE Trans. Inform. Theory 30 (1984), 587-594.
8. S. R. Finch, Artin's Constant, Section 2.4 in Mathematical Constants, Cambridge, Eng-
land: Cambridge University Press (2003), 104-110.
9. D. Gordon, Discrete Logarithms in Using the Number Field Sieve, SIAM J.
Discrete Math. 6 (1993), 312-323.
10. T. ElGamal, A Public Key Cryptosystem and a Signature Scheme Based on Dis-
crete Logarithms, IEEE Transactions on Information Theory 31(4), 1985, 469-472.
11. G.H. Hardy and E.M. Wright, An Introduction to the Theory of Numbers, 5th
edition, Oxford University Press, 1979.
12. A. Karatsuba and Y. Ofman, Multiplication of Multidigit Numbers on Automata,
Soviet Physics Doklady, volume 7, 1963, 595-596.
13. A. K. Lenstra, Using Cyclotomic Polynomials to Construct Efficient Discrete Log-
arithm Cryptosystems over Finite Fields, Proceedings of ACISP 97, LNCS 1270,
Springer-Verlag 1997, 127-138.
14. A. K. Lenstra and E. R. Verheul, The XTR Public Key System, in Advances in
Cryptology - Crypto 2000, LNCS 1880, Springer, 1-19.
15. A. K. Lenstra and E. R. Verheul, An Overview of the XTR Public Key System,
in Public-key cryptography and computational number theory (Warsaw, 2000), de
Gruyter, Berlin, 2001, 151-180.
16. H. W. Lenstra, The Chebotarev Density Theorem, URL:
17. Seongan Lim, Seungjoo Kim, Ikkwon Yie, Jaemoon Kim, Hongsub Lee, XTR
Extended to, Selected Areas in Cryptography, 8th Annual International
Workshop, SAC 2001, 301-312, Springer-Verlag, 2001.
18. A. J. Menezes, P. C. van Oorschot, S. A. Vanstone, Handbook of Applied Cryp-
tography, CRC Press, Boca Raton, FL, 1997.
19. T. Nagell, “The Cyclotomic Polynomials” and “The Prime Divisors of the Cy-
clotomic Polynomial", Sections 46 and 48 in Introduction to Number Theory. New York:
Wiley, 158-160 and 164-168, 1951.
20. A. Odlyzko, Discrete Logarithms: The past and the future, Designs, Codes and
Cryptography, 19 (2000), 129-145.


21. S. C. Pohlig, M. E. Hellman, An Improved Algorithm for Computing Logarithms
over and its Cryptographic Significance, IEEE Trans. on IT, 24 (1978),
22. J. M. Pollard, Monte Carlo methods for index computation, Math. Comp.,
32 (1978), 918-924.
23. K. Rubin and A. Silverberg, Algebraic tori in cryptography, to appear in High
Primes and Misdemeanours: lectures in honour of the 60th birthday of Hugh Cowie
Williams, Fields Institute Communications Series, American Mathematical Soci-
ety, Providence, RI (2004).
24. K. Rubin and A. Silverberg, Torus-Based Cryptography, in Advances in Cryptology
- Crypto 2003, LNCS 2729, Springer, 349-365.
25. K. Rubin and A. Silverberg, Using primitive subgroups to do more with fewer bits,
In Algorithmic Number Theory (ANTS VI), Lecture Notes in Computer Science
3076 (2004), Springer, 18-41.
26. C. P. Schnorr, Efficient Signature Generation by Smart Cards, Journal of Cryptol-
ogy, 4 (1991), 161-174.
27. O. Schirokauer, Discrete Logarithms and Local Units, Phil. Trans. R. Soc. Lond. A
345, 1993, 409-423.
28. O. Schirokauer, D. Weber, Th. F. Denny, Discrete Logarithms: the effectiveness
of the index calculus method, Proceedings ANTS II, LNCS 1122, Springer-Verlag, 1996.
29. M. Stam, Speeding up Subgroup Cryptosystems, PhD Thesis, Eindhoven University
of Technology, 2003.
30. V. Voskresenskii, Algebraic Groups and Their Birational Invariants, Translations
of Mathematical Monographs 179, American Mathematical Society, Providence,
RI, 1998.
31. A. Weimerskirch and C. Paar, Generalizations of the Karatsuba Algorithm for Ef-
ficient Implementations, URL:
http://www.crypto.ruhr-uni-bochum.de/Publikationen/, 2003.

How to Compress Rabin Ciphertexts
and Signatures (and More)

Craig Gentry
DoCoMo USA Labs

Abstract. Ordinarily, RSA and Rabin ciphertexts and signatures are
log N bits, where N is a composite modulus; here, we describe how to
“compress” Rabin ciphertexts and signatures (among other things) down
to about (2/3) log N bits, while maintaining a tight provable reduction
from factoring in the random oracle model. The computational overhead
of our compression algorithms is small. We also improve upon Coron's results
regarding partial-domain-hash signature schemes, reducing by over
300 bits the hash output size necessary to prove adequate security.

1 Introduction
The hardness of factoring is one of the most fundamental and frequently used
assumptions of public-key cryptography; yet cryptosystems that rely on the fac-
toring assumption have relatively poor performance in terms of bandwidth. For
example, RSA and Rabin ciphertexts and signatures are typically at least as
many bits as the composite modulus N, while recent advances in hardware-based
approaches to factoring (e.g., [32]) suggest that N must be more than 1024 bits
for strong security. So, factoring-based cryptosystems often do not compare fa-
vorably with cryptosystems based on alternative hard problems - e.g., ECC for
encryption or DSA for signatures.
Bandwidth consumption is important, in part because fundamental limita-
tions of wireless technology put bandwidth at a premium. For example, Barr
and [2] note that wireless transmission of a single bit can cost more
than 1000 times as much energy as a 32-bit computation. Since battery efficiency
is growing relatively slowly, energy consumption (particularly through wireless
transmission) may become a significant bottleneck.
Moreover, signal interference places physical limits on how much data can
be transmitted wirelessly in a given region. This was not a problem in wired
networks. These limitations are compounded by the lossiness of wireless channels,
which necessitates additional bandwidth in the form of forward error correction
(FEC). FEC is particularly important for cryptographic transmissions, where
partial recovery of a ciphertext or digital signature is typically useless.
These considerations make compression algorithms very attractive. In fact, in
recent years, substantial progress has been made in constructing “compressed”
cryptosystems. For example, XTR [22] and CEILIDH [30] both use “compact

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 179-200, 2004.
© International Association for Cryptologic Research 2004

representations” of certain elements to achieve a bandwidth savings. There are
also a variety of hybrid cryptosystems, such as signcryption and aggregate sig-
nature schemes, in which multiple cryptographic functionalities are somehow
represented by a single, relatively short string. However, although such hybrid
cryptosystems exist for RSA and Rabin, none of them breaks the “(log N)-bit barrier.”
OUR DESIGN GOALS. In light of these considerations, we would like to construct
a compression algorithm that is broadly applicable to factoring-based schemes,
such as RSA and Rabin. Ideally, the compression algorithm should allow RSA
and Rabin ciphertexts and signatures to be substantially less than log N bits
without sacrificing any security “ i.e., while still using (and retaining the secu-
rity of) a (log N)-bit modulus. Moreover, the compression algorithm should add
minimal computational overhead. If the compression algorithm requires addi-
tional computation, this computation should not require use of the secret key,
so that it can be performed (more quickly) outside of a “secure environment,”
such as a smart card.
OUR RESULTS. We essentially achieve our design goals, except that our tech-
niques work only for Rabin-type cryptosystems, not for RSA. Along the way, we
also substantially improve upon Coron's results on partial-domain-hash Rabin
signature schemes (Rabin-PDH).
Coron [16] proved the security of a variant of the Rabin signing scheme
(Rabin-PDH) in which the hash function that is used to hash the message out-
puts strings of length It turns out that this has a large effect
in practice; if the simulator in the security proof wishes to generate a distribution
of signatures whose statistical distance from uniform is less than, Coron's
method requires that the hash output length be at least. We
provide a perfectly uniform drawing algorithm that reduces the necessary hash
output length to only moreover, our security proof is tighter.
Our main result, however, is a compression algorithm that allows a 33%
reduction in the bit-length of Rabin signatures and ciphertexts, without any
sacrifice in security. (Notice that Coron's result is not a compression algorithm;
although the hash output length of Coron's Rabin-PDH scheme may be less
than log N bits, the Rabin-PDH signature itself, which is essentially a modular
square root of the hash output, is a (log N)-bit value.) For our improved version
of Rabin-PDH signatures, the “entropy” of the hash output is just over
bits; thus, it is theoretically possible that the signature could also be expressed in
about In fact, up to the loss of a few bits, this is precisely what we
achieve: a Rabin-PDH signature, with a tight reduction from
factoring N.
Our lossless compression algorithm also works for Rabin encryption, but in
reverse. A plaintext is “decompressed” by mapping it to a (log N)-
bit number that has a modular square. This modular square
is a “compressed” Rabin ciphertext. Numerous other cryptosystems also involve
computing square roots modulo a composite modulus N, including Fiat-Shamir,
Cocks's identity-based encryption scheme, as well as various schemes enabling

ring signatures, signcryption, and so on. Our techniques enable a similar 33%
bandwidth reduction for these schemes.
RELATED WORK. Like Coron's work, our techniques build upon Brigitte Vallée's
elegant analysis of the distribution, in of integers in
for - i.e., integers with modular
squares in a “narrow” interval. We provide a self-contained discussion of her
results in section 3.
Some previous work has been done on compressing Rabin and low-exponent
RSA signatures “ in particular, Bernstein [7] mentions that one can simply re-
move the least significant bits of any regular Rabin or RSA signature,
and the verifier can use Coppersmith's method [17] to recover those bits. Ble-
ichenbacher [8] describes an improvement: the signer can use continued fractions
to express the signature as where is about bits and
is about bits, and send as the signature. The verifier checks that
is an power (namely over The drawback of
these methods, though they arguably reduce Rabin signature length to
bits, is that they do not allow message recovery; the verifier needs before
verifying, which effectively adds to the signature length. These methods also do
not appear to be very broadly applicable; e.g., they do not appear to lead to
low-bit-length encryption, signcryption and aggregate signature schemes.
As mentioned above, Coron [16] uses a “compressed” output space for the
hash function in a Rabin signature scheme, but the partial-domain hash signa-
tures themselves are still log N bits.
ORGANIZATION OF THE PAPER. This paper is organized as follows. After noting
some preliminaries in Section 2, we describe Vallée's distributional observations
and her “quasi-uniform” drawing algorithm in section 3. In section 4, we describe
our perfectly uniform drawing algorithm, and our improvement upon Coron's
results regarding Rabin-PDH. We describe our compression algorithm in section
5, after which we describe compressed Rabin encryption and signature schemes
in section 6. Finally, in Section 7, we mention other cryptosystems “ such as
signcryption, aggregate signature and ring signature schemes “ for which our
compression algorithm allows a 33% bandwidth reduction.

2 Preliminaries
We gather some mathematical notation here for convenience. Let {0,1}* denote
the set of all bit strings, and let denote the set of all bit-strings of length
For a real number denotes the ceiling of that is, the smallest integer
value greater than or equal to. Similarly, denotes the floor of, that is,
the largest integer value less than or equal to. Finally, denotes the closest
integer to. Let the symbol denote concatenation.
Throughout, N will denote a suitable integer modulus. To be suitable, N
should at least be computationally hard to factor using any modern factoring
algorithm. In practice, one often generates N as the product of two large prime

numbers and - e.g., 512 bits apiece. However, one could choose N differently
for our schemes, if desired. For example, setting for can lead to
efficiency advantages, though one should be wary of setting too large [11].
Let for integers and and
suitable modulus N - i.e., the set of integers with modular squares in Let
B be shorthand for when N, and are understood.
A “lattice” consists of the set of all vectors that can be generated as integer
linear combinations of a set of basis vectors. For example, if and are
two basis vectors in two-dimensional space, the lattice that they generate is the
set of vectors
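The definition can be made concrete by enumerating a finite window of a small two-dimensional lattice; the basis vectors below are chosen arbitrarily for illustration:

```python
# A lattice: all integer combinations m*u + n*v of the basis vectors u, v.
u, v = (2, 1), (1, 3)

def lattice_window(u, v, bound):
    """The lattice points m*u + n*v with |m|, |n| <= bound (a finite window)."""
    return {(m * u[0] + n * v[0], m * u[1] + n * v[1])
            for m in range(-bound, bound + 1)
            for n in range(-bound, bound + 1)}

pts = lattice_window(u, v, 5)
assert (0, 0) in pts        # the origin is always a lattice point
assert (3, 4) in pts        # u + v
assert (3, -1) in pts       # 2u - v
```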

3 Distribution of Numbers with Small Modular Squares
Developing a compressed representation of numbers in that is efficiently
computable and invertible requires an understanding of how numbers in
are distributed in [0, N/ 2). The compression algorithm works, at a high level, by
taking this distribution into account.
In [33], Vallée describes the “global” distribution of in [0,N/2) in
terms of its “local” distribution in each of a set of Farey intervals that covers
[0, N/2). She then describes each local distribution in terms of points of a lattice
that lie in the region between two parabolas. For the distri-
bution of among the Farey intervals is “quasi-independent,”
allowing her to construct an algorithm that draws integers from “quasi-
uniformly.” Since Vallée's analysis forms the basis of our compression algorithm,
we review it in detail in this section.

3.1 Farey Sequences
Some properties of Farey sequences are collected in [20]; we recall them below.
Definition 1 (Farey Sequence). The Farey sequence of order is the
ascending sequence of fractions with and

The characteristic property of Farey sequences is expressed in the following the-
orem [20]:
Theorem 1. If and are consecutive in then
Another useful theorem concerning Farey sequences is the following:
Theorem 2. If and are consecutive in then
The latter theorem follows from the fact that the so-
called “mediant” of and is between and and
would be in if. Farey sequences lead naturally to the notion
of a Farey partition, in which the set of mediants partition the interval [0, N/2)
into subintervals. The formal definition is as follows.

Definition 2 (Farey Partition). The Farey partition of order of the interval
[0, N/ 2) is the set of intervals where is
the term in
So that each “end” of [0, N/ 2) is covered by the partition, we set
and where is the final fraction in the
Farey sequence.
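Theorems 1 and 2, and the mediant property behind the partition, are easy to verify by brute force for small orders; a sketch, taking the Farey sequence over [0, 1] in lowest terms as is standard:

```python
from fractions import Fraction

def farey(k):
    """The Farey sequence of order k: ascending reduced fractions p/q in [0, 1]
    with q <= k (Fraction reduces automatically; the set removes duplicates)."""
    return sorted({Fraction(p, q) for q in range(1, k + 1) for p in range(q + 1)})

F5 = farey(5)
for a, b in zip(F5, F5[1:]):
    # Theorem 1: consecutive p/q < p'/q' satisfy p'q - pq' = 1.
    assert b.numerator * a.denominator - a.numerator * b.denominator == 1
    # Theorem 2: consecutive denominators satisfy q + q' > k.
    assert a.denominator + b.denominator > 5
    # The mediant (p + p')/(q + q') lies strictly between its two neighbors.
    mediant = Fraction(a.numerator + b.numerator, a.denominator + b.denominator)
    assert a < mediant < b
```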
Vallée found it convenient to use another set of intervals called
“Farey intervals,” that are related to
Definition 3 (Farey Interval). The Farey interval of order is the
open interval with center and radius where is the term in
Using Theorems 1 and 2, one can easily prove that contains and
that the interval is no more than twice as wide as the interval
[1]. One can also prove that every number in [0, N/ 2) is covered by at least one,
and at most two, Farey intervals - e.g., by showing that, for every
intersects but neither nor contains the center
of. Vallée probably favored using the Farey intervals rather than the
in her analysis, because (roughly speaking) the fact that each
is symmetric about makes her analysis cleaner. A “Farey Covering,”
which is analogous to a Farey partition, is then defined as follows.
Definition 4 (Farey Covering). The Farey covering of order of the interval
[0, N/ 2) is the set of Farey intervals of order

3.2 The Connection between Farey Sequences and B's Distribution
Although it is far from obvious, Farey sequences have a close connection with
the distribution in of integers in. Vallée observed that the gaps
between consecutive integers in B vary widely close to the rationals of
small denominator. Close to these rationals, the distribution might be called
“clumpy,” with large gaps separating sequences of small gaps. However, as one
considers wider intervals centered at, the distribution of B-elements
provably “evens out” - i.e., the ratio of the number of B-elements in the in-
terval, versus the number one would expect if the B-elements were distributed
uniformly, approaches 1. Roughly speaking, the width of interval needed before
the “clumpiness” can be disregarded is inversely proportional to. This is one
reason why Farey intervals are useful for analyzing B's distribution; the diameter
of is also inversely proportional to
Building on the above observations, Vallée ultimately proved that the number
of in is essentially proportional to the width of
(as one would expect), as long as is large enough. Formally, Vallée proved
the following theorem [33].
Theorem 3. For and the subset and the
Farey covering of order are quasi-independent.
Vallée defines quasi-independence as follows.

Definition 5 (Quasi-Independence). A subset X and a covering of
are quasi-independent if, for all the sets X and are
for some positive constants and - i.e.,
Clearly, this definition is meaningless unless and are independent of N.
Vallée proves that and suffice when and
This means that, for these parameters, any given Farey interval has no more
than times the “density” of than any other Farey interval.
Interestingly, Vallée's proof of Theorem 3 is essentially constructive. To an-
alyze the distribution of in the “local” region Vallée
associates each with a point that is in a particular lattice and
that lies in the region between two particular parabolas. She then partitions the
lattice into a set of parallel lines. The number of lines may be very large “ e.g.,
superpolynomial in log N. Her distribution analysis then becomes “even more
local”; she provides upper and lower bounds on how many associated lattice
points can occur on each line (except for at most 6 of the lines, for which she
only provides upper bounds). These bounds imply similar bounds on the num-
ber of in. Her constructive approach results in what
one may call a “quasi-enumeration” of in in which
each element is indexed first by the line of its associated lattice point, and then
by the lattice point's position on the line. This quasi-enumeration is crucial
to Vallée's “quasi-uniform” drawing algorithm (subsection 3.3), to our uniform
drawing algorithm (section 4), and to our algorithms for losslessly compressing
(section 5).
Before discussing these algorithms, we review the details of Vallée's analysis.
Set to be the closest integer to (the center of the Farey interval). If
is in then. Now, let be the
lattice generated by the vectors and (0, N). Then, is in
precisely when there is a such that and The
latter requirement implies that is in between the two parabolas defined,
in variables and by the formulas and
Thus, if we set then each corresponds to
a lattice point in:

It may seem like a fairly complicated task to approximate how many lattice
points in are between the two parabolas defined above¹, but, as Vallée
describes, it is possible to find a lattice basis of in which the basis vectors
are each short, with one basis vector being “quasi-horizontal” and the other
being “quasi-vertical.” The basis is with:

¹ Indeed, finding all of the points on a single parabola is equivalent to finding
all of a number's modular square roots, which is equivalent to factoring.
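The footnote's equivalence rests on a classical fact: two square roots x and y of the same value mod N with y not congruent to ±x reveal a factor of N, since N divides (x - y)(x + y) but neither factor alone. A toy illustration with hand-picked numbers:

```python
from math import gcd

N = 35                    # toy composite, N = 5 * 7
x, y = 2, 12              # 2^2 = 12^2 = 4 (mod 35), and 12 is not ±2 mod 35
assert (x * x - y * y) % N == 0
factor = gcd(y - x, N)    # N | (y - x)(y + x), but N divides neither factor
assert factor == 5 and N % factor == 0
```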


Recall that and with
Having computed this short lattice basis, Vallée considers the distribution of
(and hence B-elements) on individual lines parallel to the vector r. Each
point in lies on a quasi-horizontal line that intersects the vertical axis at
ordinate for some rational index
where and where consecutive indices differ by 1. For lines with
indices from to which intersect
the region between the two parabolas in an area she dubs the “legs” (which is
in between the “chest” and the “feet”), Vallée proves the following theorem:
Theorem 4. The number of points in on the line with index in
the legs satisfies:
Her bounds on each individual line in the legs imply lower and upper bounds on
the total number of lattice points in the legs, using the inequalities:

For lines with indices in or
that intersect the “chest” or “feet,” Vallée provides no nontrivial
lower bounds on the number of they may contain, only upper
bounds. For one can verify Vallée's results that there are at most
4 lines in the chest, each with fewer than points, and that
there are at most 2 lines in the feet, each with fewer than 8 points. Ultimately,
Vallée proves Theorem 3 using her lower bounds for the legs, and upper bounds
for the chest, legs and feet.

3.3 Vallée's Quasi-uniform Drawing Algorithm
Vallée uses the above results, particularly her lower and upper bounds for the
legs, to obtain a concrete algorithm for drawing integers from quasi-uniformly
when For a quasi-uniform drawing algorithm, the
respective probabilities of any two being drawn are within a
constant factor of each other; formally:
Definition 6 (Quasi-Uniform). A drawing algorithm C, defined over a finite
set U and with values in a subset X of is said to be (or quasi-
uniform) for constants and if, for all
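To make the definition concrete, the toy sketch below (an illustrative drawer, not Vallée's; the weights are assumptions) samples from a distribution whose probabilities stay within a factor of 2 of one another, and estimates that ratio empirically:

```python
import random
from collections import Counter

def toy_quasi_uniform_draw(X):
    # Element X[0] is drawn twice as often as every other element,
    # so all drawing probabilities lie within a factor of 2 of each other.
    weights = [2] + [1] * (len(X) - 1)
    return random.choices(X, weights=weights, k=1)[0]

def empirical_probability_ratio(draw, X, trials=100_000):
    # Estimate max_x Pr[x] / min_x Pr[x]; quasi-uniformity means this
    # ratio is bounded by a constant independent of the set size.
    counts = Counter(draw(X) for _ in range(trials))
    probs = [counts[x] / trials for x in X]
    return max(probs) / min(probs)
```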

186 Craig Gentry

Vallée's algorithm is as follows:

1. Randomly Select a Starting Point: Pick a random integer with
uniform distribution.
2. Determine Farey Interval: Use continued fractions to compute for
3. Evaluate the Number of Points in Compute count
exactly the number of points in the chest and feet, and obtain a lower
bound on the number of points in the legs using Vallée's lower bounds
(with Equation 4).
4. Pick a Point from Randomly select an integer in
with uniform distribution. If output the appropriate point from
the chest or feet. Else, use Equation 4 to determine which quasi-horizontal
line would contain the point in the legs if each line met Vallée's lower
bounds, and randomly choose a point in on that line with uniform distribution.
5. Compute from the Chosen Point in Let be the lattice
point output by the previous step. Set
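The point-selection logic of Steps 3 and 4 can be sketched as follows. The exact chest/feet point list, the per-line lower bounds, and the leg-line sampler are placeholders for the paper's machinery (Farey intervals, Equation 4), so this shows only the control flow under those assumptions:

```python
import random

def draw_point(chest_feet_points, leg_lower_bounds, pick_on_leg_line):
    # chest_feet_points: exact lattice points in the chest and feet.
    # leg_lower_bounds:  Vallée-style per-line lower bounds for the legs.
    # pick_on_leg_line(i): uniform sample from quasi-horizontal leg line i.
    n_cf = len(chest_feet_points)
    total = n_cf + sum(leg_lower_bounds)
    k = random.randrange(total)             # uniform index (Step 4)
    if k < n_cf:
        return chest_feet_points[k]         # a point from the chest or feet
    k -= n_cf
    for i, bound in enumerate(leg_lower_bounds):
        if k < bound:
            return pick_on_leg_line(i)      # leg line, as if bounds were met
        k -= bound
```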

Remark 1. In Step 3, one can quickly get an exact count for how many points
are in the chest and the feet by counting the exact number of points on each
line, using simple geometry. (Recall that there are at most 4 lines in the chest,
2 in the feet.) A line intersects one of the two parabolas in at most 4 locations,
possibly cutting the line into two segments that lie in between the parabolas.
After finding the first and last lattice points on each segment, extrapolating
the total number of points on each segment is easy, since the x-coordinates of
consecutive lattice points differ by a fixed step (see Equation 2). Vallée avoids counting
the number of points on lines in the legs, since the number of lines in the legs
may be super-polynomial in log N.
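The extrapolation in the remark amounts to counting an arithmetic progression: once the first and last lattice points on a segment are known, and consecutive lattice points on the line are a fixed step apart in x (the quantity from Equation 2), the count is immediate. A minimal sketch, with the step treated as a given parameter:

```python
def count_points_on_segment(x_first, x_last, step):
    # Lattice points on the segment have x-coordinates
    # x_first, x_first + step, ..., x_last, so no enumeration is needed.
    if x_last < x_first:
        return 0
    return (x_last - x_first) // step + 1

print(count_points_on_segment(3, 23, 5))  # prints 5: x = 3, 8, 13, 18, 23
```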

The drawing algorithm outputs an that is in the same
interval as A wider interval (recall that has diameter for
and that is at least half as wide as has a higher chance
of being chosen in the first two steps. However, once an interval is chosen, any
given B-element in that interval has a lower probability of being chosen if the
interval is wide than if it is narrow. On balance, these factors even out (this is
quasi-independence), and the drawing algorithm is quasi-uniform.
In computing there are three things to consider. First, different Farey
intervals may have different “densities” of specifically, the ratio
may be as much as 20 (see discussion after Theorem 3). Second, in Step 2, we
used rather than since is between 1 and 2 times as
wide as this costs us another factor of 2. Finally, within the
interval, different lines may be closer to the lower bounds or closer to the upper
bounds, leading to a factor of Thus, is at most

4 Improving Vallée's and Coron's Results
In this section, we describe how to modify Vallée's quasi-uniform drawing
algorithm to make it perfectly uniform. Our perfectly uniform drawing algorithm
gives us an immediate improvement upon Coron's proof of security for Rabin-
PDH; in particular, it allows us to reduce the output size of the partial domain
hash function (see subsection 4.2). More generally, the fact that a simulator
can draw B-elements uniformly in responding to an adversary™s hash queries
allows us (when combined with the compression schemes of Section 5) to reduce
the bandwidth of several signature-related cryptosystems, including aggregate
signature schemes, ring signature schemes and signcryption schemes.

4.1 A Perfectly Uniform Drawing Algorithm
Modifying Vallée's quasi-uniform drawing algorithm to make it perfectly
uniform is surprisingly simple. Our modification is based on our observation that,
for any (with as required by Vallée), anyone can
efficiently compute the exact probability that Vallée's quasi-uniform
drawing algorithm will output For example, a simulator in a security proof can
compute this probability (without, of course, needing the factorization of N).
Assume, for now, that we can efficiently compute for any given Let
be a lower bound on such probabilities over all Then, the
improved drawing algorithm is as follows:
1. Use Vallée's method to pick an quasi-uniformly.
2. Compute
3. Goto Step 1 with probability
4. Otherwise, output
Since Vallée's drawing algorithm is quasi-uniform, the expected number of
“Goto” loops per draw is a small constant; thus, the simulator's estimated
time-complexity increases only by a constant factor. The probability that is
chosen in Step 1 and that it “survives” Step 3 is the same for all, namely
for this reason, and since each run of Vallée's
algorithm is independent, the algorithm is perfectly uniform.
Now, given how does one (say, a simulator) compute First, the
simulator determines the at most two Farey intervals and
that contain For the simulator computes the index of the
quasi-horizontal line that contains the lattice point associated to and
the exact number of lattice points on Similarly, if there is a second
Farey interval that contains the simulator computes
and Then, using the variables and from Vallée's drawing
algorithm, the probability that will be chosen is:


