. 5
( 10)


In Section VI.3.1 we showed that if (ak , bk )∞ is the AGM sequence, then

j (Eak ,bk ) ≡ j(Ek ) (mod 2k+1 ) ,

where Ek is the canonical lift of σ k (E). Expressing j(Eak ,bk ) as a function of
ak and bk shows that ak and bk must be correct modulo 2k+3 . Therefore, we
conclude that
am’3 am’3+n
Tr F ≡ + 2n (mod 2m ) .
am’3+n am’3
This procedure is summarized in Algorithm VI.5.

Algorithm VI.5: AGM

An elliptic curve E : y 2 + xy = x3 + c over F2n with
j(E) ∈ F4 .
OUTPUT: The number of points on E(F2n ).
1. m ← n + 2. √
2. Take r ∈ Zq with r ≡ c (mod 2).
3. a ← 1 + 4r, b ← 1 ’ 4r.
4. For i = 4 to m do:

(a, b) ← (a + b)/2, ab (mod 2i ).
6. a0 ← a.
7. For i = 0 to n ’ 1 do:

(a, b) ← (a + b)/2, ab (mod 2m ).
9. t ← a0 /a (mod 2m ).
10. If t2 > 2n+2 then t ← t ’ 2m .
11. Return 2n + 1 ’ t.

The complexity of Algorithm VI.5 is determined by Step 7, which requires
O(n) square root computations to precision m n/2. Since each square root
computation takes O(1) multiplications at full precision, the complexity of
Algorithm VI.5 is O(n2µ+1 ) bit-operations. The space complexity clearly is
O(n2 ), since only O(1) elements of Zq /(2m Zq ) are required. Note that it is
possible to replace the For-loop in Step 7 of Algorithm VI.5 by one AGM
iteration and a norm computation. Indeed, the trace of Frobenius satis¬es
t ≡ NQq /Qp (a0 /a1 ) (mod 2min(n,m) ). We refer to Section VI.5 for e¬cient norm
computation algorithms.
VI.3.4. Univariate AGM. Given the bivariate AGM sequence (ak , bk )∞ k=0
with ak ≡ bk ≡ 1 (mod 4) and ak ≡ bk (mod 8), we can easily de¬ne a
univariate AGM sequence (»k )∞ by »k = bk /ak . This sequence corresponds
to the elliptic curves
E»k : y 2 = x(x ’ 1)(x ’ »2 ) .

Since (ak+1 , bk+1 ) = ((ak + bk )/2, ak bk ), »k+1 follows from »k by

2 »k
»k+1 = . (VI.12)
1 + »k
In Section VI.3.1 we showed that for an ordinary curve E : y 2 + xy = x3 + c
with c ∈ F— , the AGM sequence can be initialized as (a0 , b0 ) = (1 + 4u, 1 ’ 4u)
where u ∈ Zq satis¬es u2 ≡ c (mod 2). However, these values cannot be
used to initialize the univariate AGM sequence, since (a0 , b0 ) is only correct
modulo 8, which would lead to »0 ≡ 1 (mod 8). However, this problem can

be solved easily by computing (a1 , b1 ) ≡ (1, 1 ’ 8u2 ) (mod 16) and starting
»1 ≡ 1 + 8u2 ≡ 1 + 8c (mod 16) ,
with c ∈ Zq and c ≡ c (mod 2). Unlike the bivariate AGM sequence, the uni-
variate AGM sequence does converge, in the sense that »k ≡ »k+n (mod 2k+3 ).
In Section VI.3.2 we showed that
bk+1 bk
≡ Σ( ) (mod 2k+3 ) ,
ak+1 ak
and thus »k+1 ≡ Σ(»k ) (mod 2k+3 ). Gaudry [143] suggested substituting this
in equation (VI.12) which shows that »k satis¬es
Σ(Z)2 (1 + Z)2 ’ 4Z ≡ 0 (mod 2k+3 ) and Z ≡ 1 + 8Σk’1 (c) (mod 16) .
De¬ne the polynomial Λ2 (X, Y ) = Y 2 (1 + X)2 ’ 4X; then »k is a solution
of Λ2 (X, Σ(X)) ≡ 0 (mod 2k+3 ). Recall that the j-invariant of the canonical
lift of E satis¬es a similar type of equation, i.e., ¦p (X, Σ(X)) = 0 with
¦p (X, Y ) the pth modular polynomial. The main di¬erence is that both
partial derivatives of Λ2 (X, Y ) vanish modulo 2. Therefore, we make the
change of variables X ’ 1 + 8X and Y ’ 1 + 8Y to obtain the modi¬ed
modular polynomial
Υ2 (X, Y ) = (X + 2Y + 8XY )2 + Y + 4XY .
Let γk be de¬ned by »k = 1+8γk ; then γk satis¬es Υ2 (X, Σ(X)) ≡ 0 (mod 2k )
and γk ≡ σ k’1 (c) (mod 2). The partial derivatives of Υ2 are
(X, Y ) = 2(X + 2Y + 8XY )(1 + 8Y ) + 4Y ,
(X, Y ) = (4(X + 2Y + 8XY ) + 1)(1 + 4X) ,
which shows that ‚Υ2 ≡ 0 (mod 2), but ‚Υ2 ≡ 1 (mod 2).
‚X ‚Y
The trace of Frobenius follows easily from Section VI.3.2 and is given by
TrF ≡ NQq /Qp (mod 2min(n,k) ) . (VI.13)
1 + 4γk

VI.4. Generalized Newton Iteration
In Sections VI.2.3 and VI.3.4 we solved equations of the form
“(X, Σ(X)) = 0 for “(X, Y ) ∈ Zq [X, Y ]
without computing the Frobenius substitution Σ. The aim of this section is
to show that in a well-chosen basis of Qq , the Frobenius substitution can be
computed e¬ciently, which can be used to devise faster algorithms to solve
the above equation “(X, Σ(X)) = 0.

VI.4.1. Fast Frobenius Substitution. The ¬rst construction admitting a
fast Frobenius substitution is due to Satoh, Skjernaa and Taguchi [287] and
works for all ¬nite ¬elds Fq Fp [t]/(f (t)). Let f (t) ∈ Zp [t] be the unique
polynomial de¬ned by
f (t)|(tq ’ t) and f (t) ≡ f (t) (mod p).
Then Qq can be represented as Qp [t]/(f (t)). The polynomial f (t) can be com-
puted modulo pm in O((nm)µ log m) time using an algorithm due to Harley;
a detailed description can be found in [333, 3.10.2].
As shown in Section VI.1, this implies that Σ(t) = tp and the Frobenius
substitution of an element ± = i=0 ±i ti can be computed e¬ciently as
n’1 n’1
±i tip ,
Σ(±) = Σ ±i t =
i=0 i=0

where the result is reduced modulo f (t). Note that in general the polynomial
f (t) is dense, so a fast reduction algorithm should be used, e.g. the one
described in [141, Chapter 9].
Similarly, the inverse Frobenius substitution Σ’1 can be computed as
’1 ’1 i
±pk+j tk
Σ (±) = Σ ±i t = Cj (t),
i=0 j=0 0¤pk+j<n

where Cj (t) = Σ’1 (tj ) ≡ tjp (mod f (t)). If we precompute Cj (t) for
j = 0, . . . , p’1, computing Σ (±) for ± ∈ Zq only takes p’1 multiplications
in Zq .
The second construction is due to Kim, Park, Cheon, Park, Kim and
Hahn [199] who proposed the use of ¬nite ¬elds with a Gaussian Normal
Basis (GNB) of small type. Such a basis can be lifted trivially to Zq and
allows e¬cient computation of arbitrary iterates of Frobenius. Recall that a
basis B of Qq /Qp is called normal if there exists an element β ∈ Qq such that
B = {Λ(β) | Λ ∈ Gal(Qq /Qp )}. A proof of the next proposition can be found
in [199].
Proposition VI.12. Let p be a prime and n, t positive integers such that
nt + 1 is a prime di¬erent from p. Let γ be a primitive (nt + 1)-th root of
unity in some extension ¬eld of Qp . If gcd(nt/e, n) = 1, with e the order of
p modulo nt + 1, then for any primitive tth root of unity „ in Z/(nt + 1)Z

is a normal element and [Qp (β) : Qp ] = n. Such a basis is called a Gaussian
Normal Basis of type t.

Kim et al. represent elements of Zq as elements of the ring
Zp [x]/(xnt+1 ’ 1) .
Multiplication of two elements in Zq /(pm Zq ) therefore requires O((nmt)µ )
bit-operations, so one normally restricts t ¤ 2.
For t = 1 we have β = γ and the minimal polynomial of β therefore is
xn+1 ’ 1
= xn + xn’1 + · · · + x + 1 .
f (x) =
To speed up the Frobenius substitution, Kim et al. use a redundant represen-
tation, i.e., they embed Zq in Zp [x]/(xn+1 ’ 1) by mapping ± = i=0 ±i β i to
±(x) = i=0 ±i xi + 0xn . Since Σk (β) = β p , we have
n n
±j/pk (mod (n+1)) xj .
Σ (±(x)) = ±i x = a0 +
i=0 j=1

Therefore, we can compute Σk (±) by a simple permutation of the coe¬cients
of ±(x), which only requires O(n) bit-operations.

VI.4.2. Satoh“Skjernaa“Taguchi Algorithm. Let x ∈ Zq be a root of
“(X, Σ(X)) = 0 for “(X, Y ) ∈ Zq [X, Y ] and assume we have computed the
approximation xm ≡ x (mod pm ). De¬ne δm = (x ’ xm )/pm ; then the Taylor
series expansion around xm gives
0 = “(x, Σ(x)) = “(xm + pm δm , Σ(xm + pm δm ))
≡ “(xm , Σ(xm )) + pm (δm ∆x + Σ(δm )∆y ) (mod p2m ) ,
with ∆x ≡ ‚X (xm , Σ(xm )) (mod pm ) and ∆y ≡ ‚Y (xm , Σ(xm )) (mod pm ).
‚“ ‚“

Since “(xm , Σ(xm )) ≡ 0 (mod pm ), we can divide this equation by pm and
obtain a relation for δm modulo pm :
“(xm , Σ(xm ))
+ δm ∆x + Σ(δm )∆y ≡ 0 (mod pm ) .
For simplicity we will assume that ordp (∆y ) = 0, i.e., that ∆y is a unit in Zq
and that ordp (∆x ) > 0. Reducing the above equation modulo p then gives
“(xm , Σ(xm ))
δm ≡ ’
(mod p) .
pm ∆y
Taking the unique pth root δm ∈ Fq leads to a better approximation of x given
by xm +pm δm ≡ x (mod pm+1 ). To avoid computing the pth root in the above
equation, Satoh, Skjernaa and Taguchi replace the equation “(X, Σ(X)) = 0
by “(Σ’1 (X), X) = 0, which clearly is equivalent. The new δm then becomes
“(Σ’1 (xm ), xm )
δm ≡ ’ m ‚“ ’1 (mod p) .
p ‚Y (Σ (xm ), xm )

Note that since “(Σ’1 (xm ), xm ) ≡ 0 (mod pm ), we only need to compute the
inverse of ‚Y (Σ’1 (xm ), xm ) modulo p. Implementing these ideas naively leads

to Algorithm VI.6, which can be used instead of Algorithm VI.4.

Algorithm VI.6: Satoh Skjernaa Taguchi Naive

Polynomial “(X, Y ) ∈ Zq , element x0 ∈ Zq satisfying
“(Σ’1 (x0 ), x0 ) ≡ 0 (mod p) and precision m.
OUTPUT: Element xm ∈ Zq , with “(Σ’1 (xm ), xm ) ≡ 0 (mod pm ) and
xm ≡ x0 (mod p).
d ← ‚Y (Σ’1 (x0 ), x0 )
(mod p).
y ← x0 (mod p).
For i = 2 to m do:
x ← Σ’1 (y) (mod pi ).
y ← y ’ d“(x, y) (mod pi ).
Return y.

The main problem of Algorithm VI.6 is that in each iteration it recomputes
“(x, y) although the values of x and y in Step i + 1 are very close to the ones
in Step i. Assume we have computed xW ≡ x (mod pW ) for some W . Then,
for all s ∈ N,

“ Σ’1 (xsW +i ), xsW +i ≡ “ Σ’1 (xsW ), xsW + ∆ (mod p(s+1)W ) , (VI.15)

‚“ ’1 ‚“ ’1
(Σ (xsW ), xsW )Σ’1 (δ) +
∆ = psW (Σ (xsW ), xsW )δ .
‚X ‚Y

Furthermore, note that it is su¬cient to know the partial derivatives
‚“ ’1 ‚“ ’1
(Σ (xsW ), xsW ) and (Σ (xsW ), xsW )
‚X ‚Y
modulo pW only. Using (VI.15), we can thus compute “(Σ’1 (xsW +i ), xsW +i )
from “(Σ’1 (xsW ), xsW ) as long as i < W .
Algorithm VI.7 consists of two stages: in the ¬rst stage we compute
xW ≡ x (mod pW ) using Algorithm VI.6 and in the second stage we use
equation (VI.15) to update “(x, y).

Algorithm VI.7: Satoh Skjernaa Taguchi

Polynomial “(X, Y ) ∈ Zq , element x0 ∈ Zq satisfying
“(Σ’1 (x0 ), x0 ) ≡ 0 (mod p) and precision m.
OUTPUT: Element xm ∈ Zq , with “(Σ’1 (xm ), xm ) ≡ 0 (mod pm ) and
xm ≡ x0 (mod p).
1. y ← Satoh Skjernaa Taguchi Naive(x0 , W ).
2. x ← Σ’1 (y) (mod pW ).
3. ∆x ← ‚X (x, y) (mod pW ).

4. ∆y ← ‚Y (x, y) (mod pW ).

5. For s = 1 to (m ’ 1)/W do:
x ← Σ’1 (y) (mod p(s+1)W ).
V ← “(x, y) (mod p(s+1)W ).
For i = 0 to W ’ 1 do:
δy ← ’ dp’(sW +i) V (mod p).
δx ← Σ’1 (δy ) (mod pW ’i ).
y ← y + psW +i δy (mod p(s+1)W ).
V ← V + p(sW +i) (∆x δx + ∆y δy ) (mod p(s+1)W ).
13. Return y.

Satoh, Skjernaa and Taguchi prove that for W nµ/(1+µ) , Algorithm VI.7
runs in time O(nµ mµ+1/(1+µ) ). Although the value for W minimizes the time
complexity, in practise it is faster to take W a multiple of the CPU word size.

VI.4.3. Solving Artin“Schreier Equations. Recall that if Fq is a ¬eld of
characteristic p, an equation of the form xp ’x+c = 0 with c ∈ Fq is called an
Artin“Schreier equation. Since Σ(x) ≡ xp (mod p), a natural generalization
to Zq is an equation of the form
a Σ(x) + b x + c = 0 , (VI.16)
with a, b, c ∈ Zq . Let x ∈ Zq be a solution of “(X, Σ(X)) = 0 and let
xm ≡ x (mod pm ). De¬ne δm = (x ’ xm )/pm ; then equation (VI.14) shows
that δm satis¬es an Artin“Schreier equation modulo pm with
‚“ ‚“ “(xm , Σ(xm ))
a= (xm , Σ(xm )) , b = (xm , Σ(xm )) , c = .
‚Y ‚X
Since we assumed that a is a unit in Zq , any solution δ to this Artin“Schreier
equation satis¬es δ ≡ δm (mod pm ), and thus xm + pm δ ≡ x (mod p2m ).
Therefore, if we can solve Artin“Schreier equations e¬ciently, we obtain an
algorithm to ¬nd a root of “(X, Σ(X)) = 0 that converges quadratically. This
strategy is formalized in Algorithm VI.8. Since the precision we compute with
doubles in every step, the complexity of Algorithm VI.8 is the same as the
complexity of the algorithm to solve an Artin“Schreier equation.

Algorithm VI.8: Gen Newton Lift

Polynomial “(X, Y ) ∈ Zq , element x0 ∈ Zq satisfying
“(x0 , Σ(x0 )) ≡ 0 (mod p) and precision m.
OUTPUT: Element xm ∈ Zq , with “(xm , Σ(xm )) ≡ 0 (mod pm ) and
xm ≡ x0 (mod p).
1. If m = 1 then
x ← x0 .
3. Else
m← m .
4. 2
x ← Gen Newton Lift(“, x0 , m );
y ← Σ(x ) (mod pm ).
V ← “(x , y ) (mod pm ).
∆x ← ‚X (x , y ) (mod pm ).
∆y ← ‚Y (x , y ) (mod pm ).
δ ← Artin Schreier Root (∆y , ∆x , V /pm , m ’ m ).
x ← x + pm δ (mod pm ).
12. Return x.

Lercier“Lubicz Algorithm: In [218] Lercier and Lubicz described the ¬rst
quadratically convergent algorithm to solve Artin“Schreier equations. Since
the algorithm uses iterated Frobenius substitutions, it is only e¬cient if Qq
admits a Gaussian Normal Basis of small type.
Dividing equation (VI.16) by a, we obtain
Σ(x) ’ β x ’ γ = 0 ,
with β = ’b/a, γ = ’c/a ∈ Zq . Let β1 = β, γ1 = γ; then Σ(x) = β1 x + γ1
and de¬ne Σk (x) = βk x + γk for k = 2, . . . , n by induction. Since Σn (x) = x
for all x ∈ Qq , we conclude that the unique solution to the above equation is
given by γn /(1 ’ βn ). To compute the βk , γk ∈ Zq , Lercier and Lubicz use a
simple square and multiply algorithm based on the formula
Σk+l (x) = Σl (βk x + γk ) = Σl (βk )(βl x + γl ) + Σl (γk ) .
The above formula immediately leads to Algorithm VI.9, which returns βk
and γk modulo pm for any k.
The complexity of Algorithm VI.9 is determined by Steps 6 and 7, which
require O(1) multiplications and O(1) iterated Frobenius substitutions in
Zq /(pm Zq ). For ¬elds with a Gaussian Normal Basis of small type, such a
substitution takes O(n) bit-operations as shown in Section VI.4.1 and a mul-
tiplication takes O((nm)µ ) bit-operations. Since the algorithm needs O(log n)
recursive calls, the time and space complexity for ¬elds with Gaussian Normal
Basis of small type are O((nm)µ log n) and O(nm), respectively.

Algorithm VI.9: Lercier Lubicz

INPUT: Elements β, γ ∈ Zq , power k, precision m.
OUTPUT: Elements βk , γk ∈ Zq such that Σk (x) ≡ βk x + γk (mod pm )
and Σ(x) ≡ βx + γ (mod pm ).
1. If k = 1 then
βk ← β (mod pm ), γk ← γ (mod pm ).
3. Else
k← k.
4. 2
(βk , γk ) ← Lercier Lubicz(β, γ, k , m).
βk ← βk Σk (βk ) (mod pm ).
γk ← γk Σk (βk ) + Σk (γk ) (mod pm ).
If k ≡ 1 (mod 2) then
βk ← β Σ(βk ) (mod pm ).
γk ← γ Σ(βk ) + Σ(γk ) (mod pm ).
11. Return (βk , γk ).

Harley™s Algorithm: In an e-mail to the NMBRTHRY list [158], Harley an-
nounced an algorithm to count the number of points on an elliptic curve over
Fpn which runs in O(n2µ log n) time and uses O(n2 ) space for ¬xed p. Note
that this algorithm is protected by a U.S. patent [159] ¬led by Harley. The
key idea is a doubly recursive algorithm to solve the Artin“Schreier equation
a Σ(x) + b x + c ≡ 0 (mod pm ) , (VI.17)
with a, b, c ∈ Zq , ordp (a) = 0 and ordp (b) > 0. Let m = m/2 and assume
we have an algorithm which returns a zero xm of (VI.17) to precision m .
Then we can use the same algorithm to ¬nd a solution xm to precision m.
Indeed, substituting xm = xm + pm ∆m in equation (VI.17) and dividing by
pm gives
a Σ(xm ) + b xm + c
≡ 0 (mod pm’m ) .
a Σ(∆m ) + b ∆m + m
This shows that ∆m itself also satis¬es an Artin“Schreier equation of the
form (VI.17) but to a lower precision. Since m ’ m ¤ m we can use the
same algorithm to determine ∆m (mod pm’m ) and therefore xm .
This immediately leads to a doubly recursive algorithm if we can solve the
base case, i.e., ¬nd a solution to (VI.17) modulo p. Since we assumed that
ordp (a) = 0 and ordp (b) > 0, we simply have to solve a xp + c ≡ 0 (mod p).

Algorithm VI.10: Harley

INPUT: Elements a, b, c ∈ Zq , a ∈ Z— , ordp (b) > 0, precision m.
OUTPUT: Element x ∈ Zq such that a Σ(x) + b x + c ≡ (mod pm ).
If m = 1 then
x ← (’c/a)1/p (mod p).
3. Else
m ← m ; M =m’m.
4. 2
x ← Harley(a, b, c, m ).
c ← aΣ(x p)+bx +c (mod pM ).
6. m

∆ ← Harley(a, b, c , M ).
x ← x + pm ∆ (mod pm ).
Return x.

The pth root in Step 2 of Algorithm VI.10 should not be computed by
naively taking the pn’1 th power. Instead, let Fq = Fp [θ]; then
1/p p’1
i k
ai θ = apk+j θ Cj (θ),
i=0 j=0 0¤pk+j<n

. This shows that for z ∈ Fq , we can compute
with Cj (θ) = (θ )1/p = θ
z with p ’ 1 multiplications over Fq .

The complexity of Algorithm VI.10 is determined by the recursive calls in
Steps 5 and 7 and the Frobenius substitution in Step 6.
If we assume that Zq is represented as in Section VI.4.1, the Frobenius
substitution in Zq /(pm Zq ) can be computed using O((nm)µ ) bit-operations.
If T (m) is the running time of Algorithm VI.10 for precision m, then we have

T (m) ¤ 2T ( m/2 ) + c(nm)µ ,

for some constant c. The above relation implies by induction that the time
complexity of Algorithm VI.10 is O((nm)µ log m). The space complexity is
clearly given by O(nm).

VI.5. Norm Computation
The ¬nal step in all point counting algorithms is to compute the norm
NQq /Qp (±) = i=0 Σi (±) of a unit ± ∈ Zq . In this section we give an overview
of the existing algorithms to compute NQq /Qp (±).

VI.5.1. Norm Computation I. Kedlaya [196] suggested a basic square
and multiply approach by computing
±i+1 = Σ2 (±i ) ±i for i = 0, . . . , log2 n ,
with ±0 = ±, and to combine these to recover NQq /Qp (±) = Σn’1 (±) · · · Σ(±)±.
Let n = l ni 2i , with ni ∈ {0, 1}, and nl = 1; then we can write
i+1 +2i+2 +···+2l n
Σ2 (±i i ) ,
NQq /Qp (±) =

where the sum 2i+1 +· · ·+2l is de¬ned zero for i ≥ l. This formula immediately
leads to Algorithm VI.11.

Algorithm VI.11: Norm I

INPUT: An element ± ∈ Zq with q = pn and a precision m.
OUTPUT: The norm NQq /Qp (±) (mod pm ).
1. i ← n, j ← 0, r ← 1, s ← ±.
2. While i > 0 do:
If i ≡ 1 (mod 2) then r ← Σ2 (r) s (mod pm ).
If i > 1 then s ← Σ2 (s) s (mod pm ).
j ← j + 1, i ← i/2 .
6. Return r.

This algorithm is particularly attractive for p-adic ¬elds with Gaussian
Normal Basis of small type, due to e¬cient repeated Frobenius substitutions.
In this case the time complexity is determined by the O(log n) multiplications
in Zq /(pm Zq ), which amount to O((nm)µ log n) bit-operations and the space
complexity is O(nm).
VI.5.2. Norm Computation II. Satoh, Skjernaa and Taguchi [287] also
proposed a fast norm computation algorithm based on an analytic method.
Let exp(x) = ∞ xi /i! and log(x) = ∞ (’1)i’1 (x ’ 1)i /i be the p-adic
i=0 i=1
exponential and logarithmic function. An easy calculation shows that exp(x)
converges for ordp (x) > 1/(p ’ 1) and log(x) converges for ordp (x ’ 1) > 0.
Assume ¬rst that ± is close to unity, i.e., ordp (± ’ 1) > 1/(p ’ 1); then
NQq /Qp (±) = exp TrQq /Qp (log(±)) ,
since Σ is continuous and both series converge. The dominant step in the
above formula is the evaluation of the logarithm. Using a naive algorithm,
this would take O(m) multiplications over Zq /(pm Zq ) or O(nµ mµ+1 ) bit-
Satoh, Skjernaa and Taguchi solve this problem by noting that ±p for
k ∈ N is very close to unity, i.e., ordp (±p ’ 1) > k + p’1 . If ± ∈ Zq /(pm Zq ),

then ±p is well de¬ned in Zq /(pm+k Zq ) and can be computed with O(k)
multiplications in Zq /(pm+k Zq ). Furthermore, note that
log(±) ≡ p’k (log(±p ) (mod pm+k )) (mod pm )
and that log(ap ) (mod pm+k ) can be computed with O(m/k) multiplications

over Zq /(pm+k Zq ). So if we take k m, then log(±) (mod pm ) can be
computed in O(nµ mµ+0.5 ) bit-operations.
In characteristic 2, we can use an extra trick. Since ord2 (± ’ 1) > 1, we
have ± ≡ 1 (mod 2ν ) for ν ≥ 2. Let z = ± ’ 1 ∈ 2ν Zq /(2m Zq ) and de¬ne
γ = 2+z ∈ 2ν’1 Zq /(2m’1 Zq ). Then ± = 1 + z = 1+γ and thus

γ 2j’1
log(±) = log(1 + z) = 2 .
2j ’ 1

Note that all the denominators in the above formula are odd. Reducing this
equation modulo 2m therefore leads to
γ 2j’1
log(±) ≡ log(1 + z) ≡ 2 (mod 2m ) .
2j ’ 1

To compute the trace TrQq /Qp , Satoh, Skjernaa and Taguchi proceed as
follows: de¬ne ±i ∈ Zp by log(±) ≡ i=0 ±i ti (mod pm ); then
TrQq /Qp (log(±)) ≡ ±i TrQq /Qp (ti ) (mod pm )

and each TrQq /Qp (ti ) for i = 0, . . . , n ’ 1 can be precomputed using Newton™s
TrQq /Qp (ti’j )fn’j + ifn’i ≡ 0 (mod pm ) ,
TrQq /Qp (t ) +

with f (t) = n fi ti the de¬ning polynomial of Qq ∼ Qp [t]/(f (t)).
Let ta = TrQq /Qp (log(a)) ∈ Zp . Then ordp (ta ) > 1/(p ’ 1), so ta ∈ pZp for
p ≥ 3 and ta ∈ 4Z2 for p = 2. If we precompute exp(p) (mod pm ) for p ≥ 3
or exp(4) (mod 2m ) for p = 2, then
exp(ta ) ≡ exp(p)ta /p (mod pm ) for p ≥ 3,
exp(ta ) ≡ exp(4)ta /4 (mod 2m ) for p = 2,
and we can use a simple square and multiply algorithm to compute exp(ta ).
Therefore, if ± is close to unity, then NQq /Qp (±) (mod pm ) can be computed
in O(nµ mµ+0.5 ) bit-operations and O(nm) space. Algorithm VI.12 computes
the norm of an element in 1 + 2ν Zq , with q = 2n , assuming that exp(4) and
TrQq /Qp (ti ) for i = 0, . . . , n ’ 1 are precomputed.

Algorithm VI.12: Norm II

INPUT: An element ± ∈ 1 + 2ν Zq with ν ≥ 2 and a precision m.
OUTPUT: The norm NQq /Qp (±) (mod 2m ).

1. s ← m/2 .
2. z ← ±2 ’ 1 (mod 2m+s ).
3. w ← log(1 + z) (mod 2m+s ).
4. w ← 2’s w (mod 2m ).
5. u ← 2’ν TrQq /Qp (w) (mod 2m’ν ).
6. Return exp(4)u (mod 2m ).

Equations (VI.5) and (VI.13) show that for p = 2 we can indeed assume
that ordp (± ’ 1) > 1/(p ’ 1); however, for p ≥ 3 we also need to consider the
more general situation where ± ∈ Zq is not close to unity. Let ± ∈ Fq denote
the reduction of ± modulo p and ±t ∈ Zq the Teichm¨ller lift of ±, i.e., the
unique (q ’ 1)th root of unity which reduces to ±. Consider the equality
NQq /Qp (±) = NQq /Qp (±t )NQq /Qp (±t ±) ;
then ordp (±t ± ’ 1) ≥ 1 > 1/(p ’ 1) since p is odd. Furthermore, note that
NQq /Qp (±t ) is equal to the Teichm¨ller lift of NFq /Fp (±). Satoh [286] pointed
out that the Teichm¨ller lift ±t can be computed e¬ciently as the solution of
and X ≡ ± (mod p) .
Σ(X) = X p
Note that we can apply the same algorithms as in Section VI.4.

VI.5.3. Norm Computation III. In an e-mail to the NMBRTHRY list [158],
Harley suggested an asymptotically fast norm computation algorithm based
on a formula from number theory that expresses the norm as a resultant.
The resultant itself can be computed using an adaptation of Moenck™s fast
extended GCD algorithm [252].
Let Zq ∼ Zp [t]/(f (t)) with f (t) ∈ Zp [t] a monic irreducible polynomial of
degree n. Let θ ∈ Zq be a root of f (t); then f (t) splits completely over Zq
n’1 n’1
as f (t) = i=0 (x ’ Σi (θ)). For ± = i=0 ±i θi ∈ Zq , de¬ne the polynomial
A(x) = i=0 ±i xi ∈ Zp [x]. By de¬nition of the norm and the resultant we
n’1 n’1
A(Σi (θ)) = Res(f (x), A(x)) .
NQq /Qp (±) = Σ (±) =
i=0 i=0

The resultant Res(f (x), A(x)) can be computed in softly linear time using a
variant of Moenck™s fast extended GCD algorithm [252] as described in detail
in Chapter 11 of the book by von zur Gathen and Gerhard [141]. The result
is an algorithm to compute NQq /Qp (±) (mod pm ) in time O((nm)µ log n).

VI.6. Concluding Remarks
The p-adic approach, introduced by Satoh at the end of 1999, has caused a
fundamental dichotomy in point counting algorithms that essentially depends
on the characteristic of the ¬nite ¬eld. For elliptic curves over ¬nite ¬elds Fpn
with p moderately large, the l-adic approach, i.e., the SEA algorithm, remains
the only e¬cient solution and requires O(n2µ+2 log2µ+2 p) time. This stands in
stark contrast with the p-adic approach which only requires O(n2µ log n) time
for ¬xed p. The O-constant grows exponentially in p though, which implies
that p-adic algorithms are only e¬cient for small p. The huge e¬ciency gap
between the l-adic and p-adic approaches in small characteristic is illustrated
by the following examples: for an elliptic curve over F2163 (resp. F2509 ) the
SEA algorithm runs about two (resp. three) orders of magnitude slower than
the p-adic algorithms.
In this chapter we have only considered the advances in elliptic curve
point counting algorithms. To provide a starting point, we brie¬‚y review the
most important developments for higher genus curves, especially hyperelliptic
curves. The p-adic approach currently comes in two ¬‚avours:
Canonical Lift / AGM. This method computes the Serre-Tate canonical
lift of an ordinary abelian variety and recovers the zeta function from the
action of Frobenius on a basis for the holomorphic di¬erentials. The main
references are as follows: Mestre [247, 248] generalized the AGM algorithm
to ordinary hyperelliptic curves over F2n . An optimized version by Lercier
and Lubicz [219] runs in time O(2g n2µ ) for an ordinary genus g hyperelliptic
curve over F2n . Ritzenthaler [282] extended Mestre™s algorithm to ordinary
genus 3 non-hyperelliptic curves.
p-adic Cohomology. These methods compute the action of a Frobenius
operator on p-adic cohomology groups. Di¬erent cohomology theories give
rise to di¬erent algorithms: Lauder and Wan [213] used p-adic analysis a la
Dwork [113] to show that the zeta function of any algebraic variety can be
computed in polynomial time. Lauder and Wan [214, 215] also devized an
algorithm based on Dwork cohomology to compute the zeta function of an
Artin“Schreier curve in time O(g 2µ+3 n3µ ). Kedlaya [196] gave algorithms to
compute with Monsky“Washnitzer cohomology and obtained an O(g 2µ+2 n3µ )
time algorithm to compute the zeta function of a hyperelliptic curve in odd
characteristic. Kedlaya™s algorithm was subsequently extended as follows: to
characteristic 2 by Denef and Vercauteren [96, 97]; to superelliptic curves
by Gaudry and G¨rel [144] and, ¬nally, to Cab curves by Denef and Ver-
cauteren [98]. A nice introduction to these p-adic cohomology theories can
be found in the PhD thesis of Gerkmann [148].

Hyperelliptic Curves and the HCDLP

P. Gaudry
In 1989, hyperelliptic curves were proposed by Koblitz for use in cryptog-
raphy as an alternative to elliptic curves. Recent improvements on the algo-
rithms for computing the group law tend to prove that indeed cryptosystems
based on hyperelliptic curves can be competitive. However, another interest
of the study of hyperelliptic curves “ and also of more general curves “ is
that the process called the Weil descent relates some instances of the ECDLP
to instances of the hyperelliptic discrete logarithm problem (HCDLP). The
purpose of this chapter is to give e¬cient algorithms for hyperelliptic curves
that are useful for the next chapter.

VII.1. Generalities on Hyperelliptic Curves
In this section, we recall brie¬‚y the basic de¬nitions and properties of
hyperelliptic curves and their Jacobians. A complete elementary introduction
to hyperelliptic curves that includes all the proofs can be found in [243]. For
a more general treatment see [226].
VII.1.1. Equation of a Hyperelliptic Curve. Hyperelliptic curves can
be de¬ned in an abstract manner as follows: a hyperelliptic curve is a curve
which is a covering of degree 2 of the projective line. In a more concrete
setting, a hyperelliptic curve C of genus g is a curve with an equation of the
C : Y 2 + H(X)Y = F (X),
where F and H are polynomials with deg F = 2g +1 and deg H ¤ g. Further-
more we ask the curve to be non-singular as an a¬ne curve, which translates
into the non-simultaneous vanishing of the partial derivatives of the equation.
This choice of equation is not always possible. In general, the polynomial
H(X) has degree at most g + 1 and F (X) has degree at most 2g + 2. The
condition for being able to reduce the degrees to g and 2g+1 is to have a ratio-
nal Weierstraß point, which is equivalent to saying that H(X) has a rational
root if the characteristic is two, or that 4F (X) + H(X)2 has a rational root
if the characteristic is odd. For this, we apply an invertible transformation
X = aX+b which sends this rational root to in¬nity. If the polynomials F
and H have degrees 2g + 1 and at most g, we call the corresponding equation
an imaginary model of the curve. Otherwise, the equation is said to be a real

model of the curve. In the following, we always consider an imaginary model.
Using the real model introduces some extra technicalities, since there are now
zero or two points at in¬nity.
In the case where the characteristic is odd, it is also possible to set H(X) =
0 without lost of generality. Indeed, we can apply the transformation Y =
Y + H(X)/2, which kills the crossed term. The division by two, prevents one
from doing the same in characteristic two.
A hyperelliptic curve comes with a natural involution called the hyperel-
liptic involution, which we denote by ι. If P = (x, y) is a point of C, then ι(P )
is the point (x, ’y ’ h(x)), which is also on C. The points which are ¬xed
under the action of ι are called the rami¬cation points and they are actually
the Weierstraß points of C.
Just like for elliptic curves, we manipulate the curve via an a¬ne equation,
but we virtually always work with the projective model. For that we denote
by ∞ the point of C at in¬nity, which is unique in the imaginary model. We
call a¬ne points the points of C that are not ∞.

VII.1.2. The Jacobian of a Hyperelliptic Curve. Let C be a hyperel-
liptic curve. Similarly to elliptic curves, C comes with a group associated to
it. However, the group law is no longer de¬ned over the curve itself but on
another object called the Jacobian of C and denoted by JC . First, we need to
consider the group of divisors on C: these are the ¬nite formal sums of points
of C over Z:

ni ∈ Z,
Div(C) = D= ni Pi ; ni = 0 for almost all i .
Pi ∈C

The degree of a divisor D = ni Pi is the sum of the coe¬cients deg D = ni
and the set of points Pi for which ni is di¬erent from 0 is called the support
of D. The set of divisors of degree 0 forms a subgroup of Div(C).
The function ¬eld K(C) of C is the set of rational functions on C. Formally,
K(C) is de¬ned as the following ¬eld of fractions:
K(C) = Frac K[X, Y ]/(Y 2 + H(X)Y ’ F (X)) .
To each non-zero function • in K(C) we can associate a divisor div(•) =
νPi (•)Pi , where νPi (•) is an integer de¬ned as follows: if • has a zero at
Pi , then νPi (•) is the multiplicity of this zero; if • has a pole at Pi , νPi (•)
is the opposite of the order of this pole. Otherwise, νPi (•) = 0. A non-
zero function has only ¬nitely many zeroes and poles, therefore almost all
the νPi (•) are zero and the divisor div(•) is well de¬ned. Such a divisor is
called a principal divisor. Furthermore, it can be shown that the degree of a
principal divisor is always zero and that the set of the principal divisors is a
subgroup of the group of degree 0 divisors. We de¬ne the Jacobian JC of C
to be the quotient group of the degree 0 divisors by the principal divisors. A

consequence of Riemann“Roch theorem is that every element of the Jacobian
can be represented by a divisor of the form
D = P1 + P2 + · · · + Pr ’ r∞,
where, for all i, the point Pi is an a¬ne point of C and the integer r is less
than or equal to the genus. If furthermore we ask that Pi be di¬erent from
ι(Pj ) for all pairs i = j, then this representation is unique. Such a divisor,
with Pi = ι(Pj ) for i = j, is called reduced.
Until now, we did not pay attention to the ¬eld of de¬nition of the objects
we considered. Usually, the good way to say that an object is de¬ned over a
¬eld K is to say that it is invariant under the action of any element of the
Galois group Gal(K/K). In the case of divisors, this de¬nition is indeed the
one we want in the sense that it makes it possible to de¬ne the elements of
the Jacobian de¬ned over K as a subgroup of the Jacobian de¬ned over K.
However, this implies in particular that a divisor de¬ned over K can include
some points Pi in its support which are not de¬ned over K. In that case all
the Galois conjugates of Pi appear within D with the same multiplicities as Pi .
From the practical point of view, it is annoying to have to work in extensions
of the base ¬elds, even when working with elements which are de¬ned over
the base ¬eld. The fact that a point always comes with its conjugates leads
to the following representation of reduced divisors.
Proposition VII.1. Let C be a hyperelliptic curve of genus g de¬ned over
a ¬eld K with equation Y 2 + H(X)Y = F (X). Then the elements of the
Jacobian of C that are de¬ned over K are in one-to-one correspondance
with the pairs of polynomials (a(X), b(X)) with coe¬cients in K, such that
deg(b) < deg(a) ¤ g, the polynomial a is monic, and a divides b2 + bH ’ F .
If a and b are two polynomials that satisfy these conditions, we denote by
div(a, b) the corresponding element of JC .
Let D = 1¤i¤r Pi ’ r∞ be a reduced divisor and denote by (xi , yi ) the
coordinates of Pi . Then the a-polynomial is given by a(X) = 1¤i¤r (X ’
xi ) and the b-polynomial is the only polynomial of degree less than r such
that b(xi ) = yi . In the case where the same point occurs several times, the
multiplicity must be taken into account to properly de¬ne b(X). From now
on, when we talk about a reduced divisor, we are actually thinking about this
representation with the two polynomials a and b.
The weight of a reduced divisor D = div(a, b) is the degree of the polyno-
mial a. We denote it by wt(D), and in the representation
Pi ’ r∞

the weight is given by r. By de¬nition, the weight of a reduced divisor is an
integer in [0, g]. The neutral element of JC is the unique weight zero divisor
div(1, 0).

VII.1.3. Hyperelliptic Curves De¬ned over a Finite Field. When the
base ¬eld K = Fq is ¬nite of cardinality q, the set of (a, b) pairs of polynomials
that we can form is ¬nite, and therefore the Jacobian is a ¬nite abelian
group. Similarly to the case of elliptic curves, the Frobenius endomorphism
• obtained by extending the map x ’ xq to the Jacobian plays a central role
in the question of evaluating the cardinalities. Due to the theorem of Weil,
the characteristic polynomial of • is of the form
P (T ) = T 2g + a1 T 2g’1 + · · · + ag T g + qag’1 T g’1 + · · · + q g’1 a1 T + q g ,
where ai is an integer for all i. Furthermore, the cardinality of the Jacobian

is given by P (1), and all the roots of P (T ) have absolute value q. There
is also a link between the ai and the cardinality of C over the extensions Fqi
for 1 ¤ i ¤ g. We skip the details that can be found for instance in [226] or
[319] and we give the bounds that we obtain:
Theorem VII.2. Let C be a hyperelliptic curve of genus g de¬ned over a
¬nite ¬eld Fq with q elements. Then we have
√ √
( q ’ 1)2g ¤ #JC (Fq ) ¤ ( q + 1)2g
and √
|#C(Fq ) ’ (q + 1)| ¤ 2g q.
This theorem shows that indeed the Jacobians of hyperelliptic curves de-
¬ned over a ¬nite ¬eld are potentially good candidates for cryptosystems
based on the discrete logarithm problem in an abelian group. Indeed, to get
a group of size N , it is necessary to have a curve of genus g de¬ned over a
¬nite ¬eld with about N 1/g elements. If the group law is not too complicated
compared to elliptic curves, then the higher number of operations can be
somewhat balanced by the fact that they take place in a ¬eld whose bit-size
is g times smaller.
VII.2. Algorithms for Computing the Group Law
In this section we consider the problem of computing e¬ectively the group
law in the Jacobian of a hyperelliptic curve. Having a good algorithm for
the group law is crucial when one wants to design a cryptosystem based on
hyperelliptic curves.
In all this section, we consider a hyperelliptic curve C of genus g de¬ned
over a ¬eld K (not necessarily ¬nite) by the equation Y 2 + H(X)Y = F (X).
VII.2.1. Algorithms for Small Genus. The ¬rst algorithm to compute
in JC is due to Cantor [59] and is a translation of Gauss™s composition and
reduction algorithms for quadratic forms. Here we give another algorithm
described in [275] and which goes back to Lagrange. In general this variant
saves many operations in the reduction phase. However, for genus 2 this is
exactly Cantor™s algorithm.

Algorithm VII.1: Lagrange™s Algorithm for Hyperelliptic Group Law

INPUT: Two reduced divisors D1 = div(a1 , b1 ) and D2 = div(a2 , b2 ).
OUTPUT: The reduced divisor D3 = div(a3 , b3 ) = D1 + D2 .
1. Use two extended GCD computations, to get
d = gcd(a1 , a2 , b1 + b2 + H) = u1 a1 + u2 a2 + u3 (b1 + b2 + H).
2. a3 ← a1 a2 /d2
3. b3 ← b1 + (u1 a1 (b2 ’ b1 ) + u3 (F ’ b2 ’ b1 H))/d (mod a3 )
4. If deg(a3 ) ¤ g then return div(a3 , b3 ).
5. a3 ← a3 , ˜3 ← b3 .
˜ b
6. a3 ← (F ’ b3 H ’ b2 )/a3 .
7. q, b3 ← Quotrem(’b3 ’ H, a3 ).
8. While deg(a3 ) > g do
t ← a3 + q(b3 ’ ˜3 ).
˜ b
˜3 ← b3 , a3 = a3 , a3 ← t.
b ˜
q, b3 ← Quotrem(’b3 ’ H, a3 ).
12. Return div(a3 , b3 ).

The important case where D1 = D2 must be treated separately; in-
deed some operations can be saved. For instance, we readily know that
gcd(a1 , a2 ) = a1 .
To analyze the run time of this algorithm, we evaluate how the number of
base ¬eld operations grows with the genus. The composition steps require one
to perform the Euclidean algorithm on polynomials of degree at most g and a
constant number of basic operations with polynomials of degree at most 2g.
This can be done at a cost of O(g 2 ) operations in the base ¬eld. The cost of
the reduction phase is less obvious. If the base ¬eld is large enough compared
to the genus, each division is expected to generate a remainder of degree
exactly one less than the modulus. The quotients q involved in the algorithm
are then always of degree one and one quickly see that the cost is again O(g 2 )
operations in the base ¬eld for the whole reduction process. It can be shown
[317] that even if the degrees of the quotients behave in an erratic way, the
heavy cost of a step where the quotient is large is amortized by the fact that
we have less steps to do and ¬nally the overall cost of reduction is always
O(g 2 ) base ¬eld operations.
VII.2.2. Asymptotically Fast Algorithms. In the late eighties, Shanks
[298] proposed a variant of Gauss™s algorithm for composing positive de¬nite
quadratic forms. The main feature of this algorithm, called NUCOMP, is that
most of the operations can be done with integers half the size of the integers
that are required in the classical algorithm. This makes a great di¬erence if

we are in the range where applying NUCOMP allows us to use machine word
sized integers instead of multiprecision integers.
In the classical algorithm, just like in Lagrange™s algorithm, the compo-
sition step produces large elements that are subsequently reduced. The idea
of the NUCOMP algorithm is to more or less merge the composition and the
reduction phases, in order to avoid those large elements. A key ingredient is
the partial GCD algorithm (also known as the Half GCD algorithm, which is
related to the continued fractions algorithm). This is essentially a Euclidean
algorithm that we stop in the middle. In NUCOMP we need a partial ex-
tended version of it, where only one of the coe¬cients of the representation
is computed.
In [182], it is shown that the NUCOMP algorithm can be adapted to
the case of the group law of a hyperelliptic curve. It su¬ces more or less to
replace integers by polynomials. We recall the Partial Extended Half GCD
algorithm here for completeness:

Algorithm VII.2: Partial Extended Half GCD (PartExtHGCD)

Two polynomials A and B of degree ¤ n and a bound
OUTPUT: Four polynomials a, b, ±, β such that
a ≡ ±A (mod B) and b ≡ βA (mod B)
and deg a and deg b are bounded by m.
An integer z recording the number of operations
± ← 1, β ← 0, z ← 0, a ← A mod B, b ← B.
While deg b > m and a = 0 do
q, t ← Quotrem(b, a).
b ← a, a ← t, t ← β ’ q±, β ← ±. ± ← t.
z ← z + 1.
If a = 0 then
Return (A mod B, B, 1, 0, 0).
If z is odd then
β ← ’β, b ← ’b.
Return (a, b, ±, β, z).

We are now ready to give the NUCOMP algorithm as described in [182].
We restrict ourselves to the case of odd characteristic where the equation of
the curve can be put in the form Y 2 = F (X). The minor modi¬cations to
handle the characteristic 2 case can be found in [182].

Algorithm VII.3: NUCOMP Algorithm

INPUT: Two reduced divisors D1 = div(u1 , v1 ) and D2 = div(u2 , v2 ).
OUTPUT: The reduced divisor D3 = D1 + D2 .
1. w1 ← (v1 2 ’ F )/u1 , w2 ← (v2 2 ’ F )/u2 .
2. If deg w2 < deg w1 then exchange D1 and D2 .
3. s ← v1 + v2 , m ← v2 ’ v1 .
4. By Extended GCD, compute G0 , b, c such that
G0 = gcd(u2 , u1 ) = bu2 + cu1 .
5. If G0 |s then
By Extended GCD, compute G, x, y such that
G = gcd(G0 , s) = xG0 + ys.
H ← G0 /G, By ← u1 /G, Cy ← u2 /G, Dy ← s/G.
l ← y(bw1 + cw2 ) mod H, Bx ← b(m/H) + l(By /H).
9. Else
G ← G0 , Ax ← G, Bx ← mb.
By ← u1 /G, Cy ← u2 /G, Dy ← s/G.
12. (bx , by , x, y, z) ← PartExtHGCD(Bx mod By , By , (g + 2)/2).
13. ax ← Gx, ay ← Gy.
14. If z = 0 then
u3 ← by Cy , Q1 ← Cy bx , v3 ← v2 ’ Q1 .
16. Else
cx ← (Cy bx ’ mx)/By , Q1 ← by cx , Q2 ← Q1 + m.
dx ← (Dy bx ’ w2 x)/By , Q3 ← ydx , Q4 ← Q3 + Dy .
dy ← Q4 /x, cy ← Q2 /bx .
u3 ← by cy ’ ay dy , v3 ← (G(Q3 + Q4 ) ’ Q1 ’ Q2 )/2 mod u3 .
21. While deg(u3 ) > g do
u3 ← (v3 2 ’ F )/u3 , v3 ← ’v3 mod u3 .
23. Return div(u3 , v3 ).

In the NUCOMP algorithm, if the input divisors are distinct, then it is
likely that G0 = 1, and then Steps 5“8 are skipped. On the other hand, in the
case of doubling a divisor, these steps are always performed whereas Step 4
is trivial. Hence it is interesting to have another version of NUCOMP that is
speci¬c to doubling. Shanks had such an algorithm which he called NUDPL
and it has been extended to our case in [182].
Steps 21“22 are there to adjust the result in the case where the divisor
div(u3 , v3 ) is not completely reduced at the end of the main part of the al-
gorithm. In general, at most one or two loops are required. Therefore the
NUCOMP algorithm requires a constant number of Euclidean algorithms on
polynomials of degree O(g) and a constant number of basic operations with
polynomials of degree O(g). Thus the complexity is O(g 2 ), again, just as for

Lagrange™s algorithm. However, as we said before, the NUCOMP algorithm
works with polynomials of smaller degree (in practice rarely larger than g).
Up to now we have considered naive algorithms for polynomial multipli-
cations and GCDs. If we used an FFT-based algorithm, two polynomials
of degree O(n) can be multiplied in time O(n log n log log n) operations in
the base ¬eld. It is then possible to design a recursive GCD algorithm that
requires O(n log2 n log log n) operations [141]. Furthermore, the same algo-
rithm can also be used to computed the Partial Extended Half GCD in the
same run time (this is actually a building block for the fast recursive GCD
computation). Plugging those algorithms in NUCOMP and NUDUPL yields
an overall complexity of O(g log2 g log log g) which is asymptotically faster
than the O(g 2 ) complexity of Lagrange™s algorithm.

VII.2.3. Which Algorithm in Practice. When the genus is large, the
best algorithm is not the same as when the genus is small. In particular,
for very small genus, it is possible to write down the explicit formulae cor-
responding to the Cantor or Lagrange algorithm. Then some operations can
be saved by computing only the coe¬cients that are necessary. For instance,
the quotient of two polynomials costs less if it is known in advance that the
division is exact (see [258]). For genus 2 and 3, such optimized formulae have
been worked out by many contributors. We refer to [212] for a survey of the
current situation in genus 2, and to [277] for genus 3. In the case of genus 2,
the best formulae for adding (resp. doubling) have a cost of 25 multiplications
and 1 inversion (resp. 27 multiplications and 1 inversion).
When the genus is larger than 3, explicit formulae are not available and
the best algorithm is then Lagrange™s algorithm.
According to [182], the genus for which the NUCOMP and NUDUPL
algorithms are faster than Lagrange™s algorithm is around 10. This value de-
pends highly on the implementation of those two algorithms, on the particular
type of processor that is used, and on the size of the base ¬eld.
It is likely that the asymptotically fast algorithm for computing GCD will
never be useful for practical applications: the point at which it becomes faster
than the classical Euclidean algorithm is around genus 1000.
We ¬nally mention that an analogue of projective coordinates for elliptic
curves has been designed. This allows one to perform the group law without
any inversion, which can be crucial in some constrained environments such as
smart cards. In addition, in the spirit of [85], di¬erent types of coordinates
can be mixed to have a faster exponentiation algorithm [212].

VII.3. Classical Algorithms for HCDLP
The hyperelliptic curve discrete logarithm problem (HCDLP) is a gener-
alization of the elliptic curve discrete logarithm problem (ECDLP). As such
it comes as no surprise that every known attack on the ECDLP, for some

particular instances, can be extended to an attack on the HCDLP, for the cor-
responding weak instances. We review all these attacks that we call classical,
in opposition to the index-calculus attacks which are speci¬c to hyperelliptic
curves. We assume that the reader is familiar with all attacks on ECDLP
that are listed in Chapter V of [ECC].
VII.3.1. The Simpli¬cation of Pohlig and Hellman. Let G denote an
abelian group of known order N whose factorization is also known. The algo-
rithm of Pohlig and Hellman reduces the computation of a discrete logarithm
in G to O(log N ) computations of discrete logarithms in subgroups of G of
order p where p is a prime factor of N . The method works for a generic group
and in particular can be applied to HCDLP. It means that solving a discrete
logarithm problem in the Jacobian of a hyperelliptic curve is not harder than
the discrete logarithm problem in the largest prime order subgroup of the
VII.3.2. The MOV and Frey“R¨ck Attacks. Let C be a hyperelliptic
curve of genus g over the ¬nite ¬eld Fq and let us consider a DL problem in a
subgroup of JC of prime order m. Denote by k the smallest integer such that
m divides q k ’ 1, i.e., k is the order of q modulo m. Then there is an e¬cient
algorithm that converts the initial discrete logarithm problem into a discrete
logarithm problem in the multiplicative group of the ¬nite ¬eld Fqk . If k
is small enough, the existence of a subexponential algorithm for computing
discrete logarithms in ¬nite ¬elds can lead to an e¬cient way of solving the
initial problem in JC .
For more details on the method to transport the HCDLP to a classical
DLP we refer to Chapter IX. Indeed the key ingredients are the Weil and
Tate pairings which can, as we shall see in Chapter X, also be used in a
constructive way.
VII.3.3. The R¨ck Attack. The previous attack works only if the order m
of the subgroup is prime to the ¬eld characteristic (otherwise the parameter k
is not even de¬ned). In the case where m is precisely the ¬eld characteristic,
R¨ck [283] has extended the so-called anomalous curves attack (see [ECC,
Section V.3]) to hyperelliptic curves. He proved
Theorem VII.3. Let C be a hyperelliptic curve of genus g de¬ned over a
¬nite ¬eld of characteristic p. The discrete logarithm problem in a subgroup
of JC of order p can be solved with at most O(log p) operations in the base
¬eld and in the Jacobian.
In fact R¨ck proved his result in a more general setting where the curve
does not need to be hyperelliptic.
The principle is the following: let D be a degree 0 divisor whose class
in JC is of order p. Then, by de¬nition, there exists a function • such that
div(•) = p · D. The holomorphic di¬erential d•/• can be expressed using a

local parameter t at the point at in¬nity ∞: we can write d•/• = ‚•/‚t

We can then de¬ne the power series

ai ti ,
• i=0

where the ai are scalars. The key point is that the function that maps the
class of D to the 2g ’ 1 ¬rst coe¬cients (a0 , . . . , a2g’2 ) is an homomorphism
of additive groups. The discrete logarithm problem is then transported into
a group where it is easily solvable.
The explicit computation of this homomorphism can be done e¬ciently
using a technique similar to the method used for computing the Weil pairing.

VII.3.4. Baby Step/Giant Step and Random Walks. The generic al-
gorithms, such as the Baby Step/Giant Step algorithm, as described in Sec-
tion V.4 of [ECC], can be applied to compute discrete logarithms in any
abelian group. In particular, it is straightforward to obtain an algorithm for
solving the HCDLP in a subgroup of order N of the Jacobian of a hyperelliptic

curve in at most O( N ) group operations.
Also, the parallelized Pollard lambda method of van Oorschot and Wiener

[266] directly applies to HCDLP, providing a O( N ) algorithm with a speed-
up by a factor of M if M machines are used and requiring a small amount of
Furthermore the presence of equivalence classes arising from the hyperel-
liptic involution as well as any other quickly computable automorphism can
be used just like for elliptic curves [111], [138].

VII.4. Smooth Divisors
In 1994, Adleman, DeMarrais and Huang [4] discovered a subexponential
algorithm for computing discrete logarithms in the Jacobian of hyperelliptic
curves of “large genus.” Compared with points of elliptic curves, divisors in
the Jacobian of a hyperelliptic curve carry more structure. Indeed, a reduced
divisor is represented using two polynomials (a, b), and the factorization of
a as a polynomial in K[X] is somewhat compatible with the Jacobian group
law. This fact is the key stone for de¬ning a smoothness notion and then the
Adleman, DeMarrais and Huang index-calculus algorithm.
Lemma VII.4. Let C be a hyperelliptic curve over a ¬nite ¬eld Fq . Let D =
div(a, b) be a reduced divisor in JC (Fq ). Write a(X) = ai (X) as the factor-
ization in irreducible factors of a(X) in Fq [X]. Let bi (X) = b(X) (mod ai (X)).
Then Di = div(ai , bi ) is a reduced divisor and D = Di in JC .
With this result, a reduced divisor can be rewritten as the sum of reduced
divisors of smaller size, in the sense that the degree of their a-polynomial is

smaller. More precisely, we have wt(D) = wt(Di ). The divisors that can
not be rewritten are those for which the a-polynomial is irreducible.
Definition VII.5. A reduced divisor D = div(a, b) is said to be prime if the
polynomial a is irreducible.
Definition VII.6. Let B be an integer. A reduced divisor D = div(a, b) is
said to be B-smooth if the polynomial a is the product of polynomials of degree
at most B.
This de¬nition is very analogous to the smoothness notion that is used to
design index-calculus algorithms in the multiplicative group of a ¬nite ¬eld.
The next step is to study the proportion of smooth divisors.
We start with the easy case where B = 1, which will be of great impor-
tance in applications.
Proposition VII.7. Let C be a hyperelliptic curve of genus g over the ¬nite
¬eld Fq . Then the proportion of 1-smooth reduced divisors in JC (Fq ) tends to
1/g! when g is ¬xed and q tends to in¬nity.
The proof relies on Theorem VII.2. The 1-smooth reduced divisors are
formed using at most g points of C(Fq ), and the order does not matter. There-
fore there are about q g /g! reduced divisors that are 1-smooth. This number
should be compared with #JC (Fq ) ≈ q g . The error terms given by the precise
bounds are an order of magnitude smaller, so that we obtain the announced
If we do not ¬x B = 1, intuitively we expect the same kind of behaviour as
for polynomials over ¬nite ¬elds for which taking a subexponential smooth-
ness bound yields a subexponential proportion of smooth elements. This
heuristic has been made precise by Enge and Stein [118]. We recall the def-
inition of the so-called subexponential function parametrized by a real ± in
[0, 1] and a constant c:
LN (±, c) = exp c(log N )± (log log N )1’± .
Theorem VII.8 (Enge and Stein). Let C be a hyperelliptic curve of genus
g de¬ned over the ¬nite ¬eld Fq . Let ρ be a positive constant and let B =
logq Lqg ( 1 , ρ) be a smoothness bound. Then the number of B-smooth reduced
divisors in JC (Fq ) is at least
1 1
q g · Lq g , ’ + o(1) .
2 2ρ
It is worth noting that B is de¬ned as the ceiling of the usual expres-
sion. Indeed, as a degree, it has to be an integer. This looks ino¬ensive
but is actually the heart of the main problem that occurs below, namely the
subexponential behaviour can be observed only if the genus is large enough
compared to the size of the base ¬eld. If the genus is too small, the smooth-
ness bound would be smaller than 1, and we are back to the situation of the
previous proposition.

VII.5. Index-Calculus Algorithm for Hyperelliptic Curves
The original algorithm by Adleman, DeMarrais and Huang is highly heuris-
tic and more theoretical than of practical use. Their idea was to work with
functions on the curve. This idea was pushed further by Flassenberg and
Paulus [121] who were able to design a sieving technique, and they did the
¬rst implementation of this kind of algorithm. Later on, M¨ller, Stein and
Thiel [257] and Enge [116] managed to get rid of the heuristics, by working
with elements of the Jacobian instead of functions, so that Theorem VII.8
could be applied.
The method we shall now describe is the further improvement by Enge
and Gaudry [117]. We give a version which is close to what is really done
when it is implemented; there exists also a more complicated version that has
a heuristic-free complexity.
We ¬rst present a sketch of the algorithm, and then details about each
phase are given.

Algorithm VII.4: Hyperelliptic Index-Calculus Algorithm

A divisor D1 in JC (Fq ) with known order N = ord(D1 ),
and a divisor D2 ∈ D1
OUTPUT: An integer » such that D2 = »D1
1. Fix a smnoothness bound and construct the factor basis.
2. While not enough relations have been found do:
Pick a random element R = ±D1 + βD2 .
If R is smooth, record the corresponding relation.
Solve the linear algebra system over Z/N Z.
Return ».

VII.5.1. Construction of the Factor Basis. First a smoothness bound
B is chosen. In the complexity analysis, we shall explain how to get the best
value for B. Then the factor basis F contains all the prime reduced divisors
of weight at most B:
F = {P ∈ JC : P is prime, wt(P ) ¤ B}.
This set can be constructed in a naive way: for each monic polynomial of
degree at most B, check if it is irreducible and if it is the a-polynomial of
a reduced divisor. In that case, ¬nd all the compatible b-polynomials and
add the corresponding divisors to F. For convenience, we give names to the
elements of F:
F = {gi : i ∈ [1, #F]}.

VII.5.2. A Pseudo-Random Walk. Selecting a random element R =
±D1 + βD2 is costly: the values of ± and β are randomly chosen in the
interval [1, N ] and then two scalar multiplications have to be done. Using
the binary powering method, the cost is O(log N ) group operations. We use
a pseudo-random walk instead, so that the construction of each new random
element costs just one group operation.
The pseudo-random walk is exactly the same as the one which is used in
[323] in discrete log algorithms based on the birthday paradox. For j from 1
to 20 we randomly choose aj and bj in [1, N ] and compute the “multiplier”
Tj ← aj D1 + bj D2 . In our case where the group is traditionally written
additively, “summand” would be a better name but we stick to the classical
name. The pseudo-random walk is then de¬ned as follows:
• R0 is given by ±0 D1 + β0 D2 where ±0 and β0 are randomly chosen in
[1, N ].
• Ri+1 is de¬ned by adding to Ri one of the multipliers Tj . The index
j is given by the value of a hash function evaluated at Ri . Then the
representation of Ri+1 in terms of D1 and D2 is deduced from the
corresponding represenation of Ri : we set ±i+1 = ±i + aj and βi+i =
βi + bj for the same j.
Practical experiments [323] suggest that by taking 20 multipliers the
pseudo-random walk behaves almost like a purely random walk. In our case,
it is not really necessary to take j deterministic in terms of Ri : picking a
random multiplier at each step would do the job. The determinism in the
pseudo-random walk was necessary in [323] to use distinguished points, but
it is not used in our algorithm.

VII.5.3. Collection of Relations. Each time we produce a new random
element R = ±D1 + βD2 , we test it for smoothness. If R is not smooth, then
we continue the pseudo-random walk to get another element. The smoothness
test is done by factoring the a-polynomial of R. If all its irreducible factors
have degree at most B, then Lemma VII.4 allows us to write R as a sum of
elements of F. The kth smooth element Sk = ±k D1 + βk D2 that is found is
stored in the kth column of a matrix M = (mik ) which has #F rows:

Sk = mik gi = ±k D1 + βk D2 .

We also store the values of ±k and βk for later use.

VII.5.4. Linear Algebra. After #F + 1 relations have been collected, the
matrix M has more columns than rows. As a consequence, there exists a
non-trivial vector (γk ) in its kernel. Then by construction we have

γk Sk = 0 = γk ±k D1 + γk βk D2 .

Now, with high probability, γk βk is di¬erent from 0 modulo N and we can
compute the wanted discrete logarithm
» = ’( ±k γk )/( βk γk ) (mod N ).

To compute the vector (γk ), we could use Gaussian elimination. However, we
can do better, because the matrix is very sparse. Indeed, in the kth column, a
non-zero entry at index i means that the ith element of F has a a-polynomial
which divides the a-polynomial of the reduced divisor Sk , the degree of which
is bounded by g, the genus of C. Hence there are at most g non-zero entries in
each column. For such sparse matrices, Lanczos™s algorithm or Wiedemann™s
algorithm are much faster than Gaussian elimination, though they are not

VII.6. Complexity Analysis
In our analysis, we denote by cJ the cost of one group operation in JC (Fq ),
by cF the cost of testing the B-smoothness of a polynomial of degree g over
Fq and by cN the cost of an operation in Z/N Z. With Lagrange™s algorithm
and naive multiplication algorithms in Fq , we obtain cJ = O(g 2 log2 q), and if
we use NUCOMP and asymptotically fast algorithms everywhere, we obtain
cJ = O(g log q µ(g, log q)), where µ(g, log q) is bounded by a polynomial in
log g and log log q. The B-smoothness test is the ¬rst phase of the Cantor-
Zassenhaus algorithm called Distinct-Degree-Factorization, and its costs is
cF = O(Bg 2 log3 q) if naive multiplication algorithms are used.

VII.6.1. Analysis When g is Large Enough. We write the smoothness
bound B in the form
B = logq Lqg ( , ρ) ,
where ρ is some positive real number. By Theorem VII.8, the number of
B-smooth reduced divisors in JC (Fq ) is at least
1 1
q g · Lq g ,’ ,
2 2ρ
so that the expected number of steps in the pseudo-random walk before ¬nd-
ing a B-smooth divisor is about Lqg 1 , 2ρ . We need #F + 1 relations,
therefore the expected number of steps in the pseudo-random walk that are
required to ¬ll in the matrix is about
#F · Lqg , .
2 2ρ
The cardinality of the factor basis is bounded by a constant times the number
of polynomials of degree at most B. Hence #F is O(q B ) = O(Lqg ( 1 , ρ)).

Putting all of this together, we see that the algorithm requires
1 1
O Lq g ,ρ +
2 2ρ
steps in the pseudo-random walk. At each step, we need to perform one group
operation (which costs cJ ) and the factorization of the a-polynomial of the
divisor (which costs cF ).
Once the matrix M has been computed, we need to ¬nd a vector of the
kernel of M . This matrix has size #F and has at most g non-zero entries per
column. The cost of Lanczos™s algorithm is then
O (#F)2 g = O g Lqg , 2ρ
operations in Z/N Z.
Thus the total run time of the algorithm is
1 1 1
O (cJ + cF ) · Lqg + (gcN ) · Lqg
,ρ + , 2ρ .
2 2ρ 2
To analyze this run time and ¬nd the optimal value for ρ, we remark that
cJ , cF , cN and g are all polynomials in g and log q, whereas the Lqg (...) parts
are subexponential in those parameters. Therefore it makes sense to neglect
them when trying to balance both phases. We are then left with minimizing
the function
1 1 1 1 1
, 2ρ ≈ Lqg
Lq g ,ρ + + Lqg , max(ρ + , 2ρ) .
2 2ρ 2 2 2ρ
An easy calculation shows that the minimum is obtained for ρ = With
this choice of ρ, we obtain an overall complexity of
O (cJ + cF + gcN ) · Lqg ,2 .
Before putting this result into a theorem, we need to make clear an abusive
simpli¬cation that we did along the way: when evaluating the size √ the factor
basis, we “forgot” the ceiling sign for B. However, if logq Lqg ( 2 , 2) is close
to 0, the fact that we round it to 1 introduces an exponential factor in the
complexity analysis. Therefore, what we did makes sense only if this value
is large enough so that the rounding has no e¬ect on the complexity. This
requirement is full¬lled if and only if g is large enough compared to log q.
Theorem VII.9 (Enge“Gaudry[117]). There exists an algorithm that can

solve the HCDLP in time O Lqg 1 , 2 when q and g grow to in¬nity with
the condition that g stays “large enough” compared to log q.
The complexity of the algorithm can be made more precise by assuming
that there exists a constant ‘ such that g > ‘ log q. Then the run time is in

O Lqg 1 , c(‘) , where c(‘) is an explicit function that tends to 2 when ‘
tends to in¬nity.

VII.6.2. Complexity When g is Fixed. In this subsection, we analyze
the behaviour of Algorithm VII.4 when g is not large compared to log q, and in
particular when g is ¬xed and q tends to in¬nity. In that case, the theoretical

optimal smoothness would be logq Lqg ( 1 , 2), which tends to 0. Therefore
we assume from now on that B = 1.
By Proposition VII.7, the probability for a divisor to be 1-smooth is g! .
Note that in our algorithm, we take random elements not in the whole Ja-
cobian but in the subgroup generated by D1 . Heuristically, as long as the
subgroup is large enough compared to g!, the smoothness probability of g!
still holds.
The expected time for ¬nding enough relations to ¬ll in the matrix is then
(cJ + cF )g!q, and the time for the linear algebra is O(cN gq 2 ). Putting this
together, we obtain
Theorem VII.10 (Gaudry[142]). There exists an algorithm that can solve
the HCDLP in large enough subgroups in time O((g 2 log3 q)g!q + (g 2 log q)q 2 ),
when q tends to in¬nity and g is ¬xed.
VII.6.3. Very Small Genera. In the case where the genus is ¬xed and very
small, Harley proposed an improvement to the previous technique. Taking
B = 1 is far from the theoretical optimal value for subexponentiality, but as a
degree it cannot be reduced. Still it is possible to keep in the basis only a small
proportion of the weight one reduced divisors. Fix #F = ± , where ± < q is to
be determined. A 1-smooth element has a probability 1/±g to be represented
over this reduced basis, therefore the expected number of steps before ¬nding
a relation is ±g g!. On the other hand, the number of required relations is
also reduced: we need only q/± + 1 columns in the matrix. Therefore the
expected number of steps in order to ¬ll in the matrix is qg!±g’1 . Also, the
linear algebra step is made easier by the fact that the factor basis is smaller,
the cost is O(q 2 /±2 ). Balancing the two phases leads to an optimal value for
± which is (q/g!)1/(g+1) . With this choice, we obtain an overall complexity of
(forgetting the coe¬cients which are polynomials in g and log q)
2g 2
O q g+1 · (g!) g+1 .

The term in g! has a negligible contribution, especially because g is supposed


. 5
( 10)