
PART IV

LATTICES
16
Lattices

The word “lattice” has two different meanings in mathematics. One meaning is related to
the theory of partial orderings on sets (for example, the lattice of subsets of a set). The other
meaning, which is the one relevant to us, is discrete subgroups of Rn .
There are several reasons for presenting lattices in this book. First, there are hard
computational problems on lattices that have been used as a building block for public key
cryptosystems (e.g., the Goldreich–Goldwasser–Halevi (GGH) cryptosystem, the NTRU
cryptosystem, the Ajtai–Dwork cryptosystem and the LWE cryptosystem); however, we do
not present these applications in this book. Second, lattices are used as a fundamental tool for
cryptanalysis of public key cryptosystems (e.g., lattice attacks on knapsack cryptosystems,
Coppersmith’s method for finding small solutions to polynomial equations, attacks on
signatures and attacks on variants of RSA). Third, there are applications of lattices to
efficient implementation of discrete logarithm systems (such as the GLV method; see
Section 11.3.3). Finally, lattices are used as a theoretical tool for security analysis of
cryptosystems, for example the bit security of Diffie–Hellman key exchange using the
hidden number problem (see Section 21.7) and the security proofs for RSA-OAEP.
Some good references for lattices, applications of lattices and/or lattice reduction algo-
rithms are: Cassels [114], Siegel [504], Cohen [127], von zur Gathen and Gerhard [220],
Grötschel, Lovász and Schrijver [245], Nguyen and Stern [414, 415], Micciancio and Gold-
wasser [378], Hoffstein, Pipher and Silverman [261], Lenstra’s chapter in [106], Micciancio
and Regev’s chapter in [48] and the proceedings of the conference LLL+25.

Notation used in this part


Z, Q, R Integers, rational, real numbers
b, v, w Row vectors (usually in Rm )
0 Zero vector in Rm
ei ith unit vector in Rm
In n × n identity matrix
⟨x, y⟩ Inner product
‖x‖ Euclidean length (ℓ2 norm)
‖ · ‖a ℓa -norm for a ∈ N


span{v 1 , . . . , v n } Span of a set of vectors over R


rank(A) Rank of a matrix A
⌈x⌋ Closest integer to x, ⌈1/2⌋ = 1
B Basis matrix for a lattice
L Lattice
b∗i Gram–Schmidt vector arising from ordered basis {b1 , . . . , bn }
µi,j Gram–Schmidt coefficient ⟨bi , b∗j ⟩/⟨b∗j , b∗j ⟩
Bi ‖b∗i ‖²
λi Successive minima of a lattice
det(L) Determinant of a lattice
γn Hermite's constant
X Bound on the size of the entries in the basis matrix B
B(i) i × m matrix formed by the first i rows of B
di Determinant of matrix of ⟨bj , bk ⟩ for 1 ≤ j, k ≤ i
D Product of di
P1/2 (B) Fundamental domain (parallelepiped) for lattice basis B
F (x), F (x, y) Polynomial with “small” root
G(x), G(x, y) Polynomial with “small” root in common with F (x) (resp.,
F (x, y))
X, Y Bounds on size of root in Coppersmith’s method
bF Coefficient vector of polynomial F
R(F, G), Rx (F (x), G(x)) Resultant of polynomials
W Bound in Coppersmith’s method
P, R Constants in noisy Chinese remaindering
amp(x) The amplitude gcd(P , x − R) in noisy Chinese remaindering

16.1 Basic notions on lattices


A lattice is a subset of the vector space Rm . We write all vectors as rows; be warned that
many books and papers write lattice vectors as columns. We denote by ‖v‖ the Euclidean
norm of a vector v ∈ Rm , though some statements also hold for other norms.

Definition 16.1.1 Let {b1 , . . . , bn } be a linearly independent set of (row) vectors in Rm


(m ≥ n). The lattice generated by {b1 , . . . , bn } is the set
    L = { ∑_{i=1}^{n} li bi : li ∈ Z }

of integer linear combinations of the bi . The vectors b1 , . . . , bn are called a lattice basis.
The lattice rank is n and the lattice dimension is m. If n = m then L is said to be a full
rank lattice.
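As a concrete illustration, the following Python sketch enumerates the lattice points whose coefficients li are small, for a given row basis. It is a minimal sketch: the helper name, the example basis and the coefficient bound are our own illustrative choices.

    import itertools

    def lattice_points(basis, bound):
        # All integer linear combinations l1*b1 + ... + ln*bn with |li| <= bound.
        m = len(basis[0])
        for coeffs in itertools.product(range(-bound, bound + 1),
                                        repeat=len(basis)):
            yield tuple(sum(l * b[k] for l, b in zip(coeffs, basis))
                        for k in range(m))

    # A rank 2 lattice in R^3 (n = 2 < m = 3, so not a full rank lattice).
    for v in sorted(set(lattice_points([(1, 0, 1), (0, 1, 1)], 1))):
        print(v)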

Let L ⊂ Rm be a lattice. A sublattice is a subset L′ ⊆ L that is a lattice.


A basis matrix B of a lattice L is an n × m matrix formed by taking the rows to be
basis vectors bi . Thus Bi,j is the j th entry of the row bi and

L = {xB : x ∈ Zn }.

By assumption, the rows of a basis matrix are always linearly independent.


Example 16.1.2 The lattice in R2 generated by {(1, 0), (0, 1)} is L = Z2 . The corresponding
basis matrix is B = [1 0; 0 1]. Any 2 × 2 integer matrix of determinant ±1 is also a basis
matrix for L.

We will mainly assume that the basis vectors bi for a lattice have integer entries. In
cryptographic applications, this is usually the case. We interchangeably use the words
points and vectors for elements of lattices. The vectors in a lattice form an Abelian group
under addition. When n ≥ 2 there are infinitely many choices for the basis of a lattice.
An alternative approach to lattices is to define L = Zn and to have a general length func-
tion q(v). One finds this approach in books on quadratic forms or optimisation problems,
e.g. Cassels [113] and Schrijver [478]. In particular, Section 6.2 of [478] presents the LLL
algorithm in the context of reducing the lattice L = Zn with respect to a length function
corresponding to a positive-definite rational matrix.
We now give an equivalent definition of lattice, which is suitable for some applications.
A subset L ⊆ Rm is called discrete if, for any real number r > 0, the set {v ∈ L : ‖v‖ ≤ r}
is finite. It is clear that a lattice is a subgroup of Rm that is discrete. The following result
shows the converse.

Lemma 16.1.3 Every discrete subgroup of Rm is a lattice.

Proof (Sketch) Let {v 1 , . . . , v n } be a linearly independent subset of L of maximal size.


The result is proved by induction. The case n = 1 is easy (since L is discrete there is
an element of minimal non-zero length). When n > 1 consider V = span{v1 , . . . , vn−1 }
and set L′ = L ∩ V . By induction, L′ is a lattice and so has a basis b1 , . . . , bn−1 . The set
L ∩ { ∑_{i=1}^{n−1} xi bi + xn vn : 0 ≤ xi < 1 for 1 ≤ i ≤ n − 1 and 0 < xn ≤ 1 } is finite and so
has an element with smallest xn , call it bn . It can be shown that {b1 , . . . , bn } is a basis for
L. For full details see Theorem 6.1 of [527]. □

Exercise 16.1.4 Given an m × n integer matrix A show that ker(A) = {x ∈ Zm : xA = 0}


is a lattice. Show that the rank of the lattice is m − rank(A). Given an m × n integer matrix
A and an integer M show that {x ∈ Zm : xA ≡ 0 (mod M)} is a lattice of rank m.

In the case m > n it is sometimes convenient to project the lattice L into Rn using the
following construction. The motivation is that a linear map that preserves lengths preserves
volumes. Note that if the initial basis for L consists of vectors in Zm then the resulting basis
in Rn does not necessarily have integer entries.

Lemma 16.1.5 Let B be an n × m basis matrix for a lattice L where m > n. Then there
is a linear map P : Rm → Rn such that P (L) is a rank n lattice and ‖P (v)‖ = ‖v‖ for all
v ∈ L. Furthermore, ⟨bi , bj ⟩ = ⟨bi P , bj P ⟩ for all 1 ≤ i < j ≤ n.
If the linear map is represented by an m × n matrix P then a basis matrix for the image
of L under the projection P is the n × n matrix BP , which is invertible.

Proof Given the n × m basis matrix B with rows bi , define V = span{b1 , . . . , bn } ⊂ Rm ,
which has dimension n by assumption. Choose (perhaps by running the Gram–Schmidt
algorithm) a basis v1 , . . . , vn for V that is orthonormal with respect to the inner product
in Rm . Define the linear map P : V → Rn by P (vi ) = ei and P (V ⊥ ) = {0}. For v =
∑_{i=1}^{n} xi vi ∈ V we have ‖v‖² = ⟨v, v⟩ = ∑_{i=1}^{n} xi² = ‖vP ‖². Since the vectors bi form
a basis for V , the vectors bi P are linearly independent. Hence, BP is an invertible matrix
and P (L) is a lattice of rank n. □

We can now prove the following fundamental result.

Lemma 16.1.6 Two n × m matrices B and B′ generate the same lattice L if and only if
B and B′ are related by a unimodular matrix, i.e. B′ = U B where U is an n × n matrix
with integer entries and determinant ±1.

Proof (⇒) Every row of B′ is an integer linear combination

    b′i = ∑_{j=1}^{n} ui,j bj

of the rows in B. This can be represented as B′ = U B for an n × n integer matrix U .
Similarly, B = U ′ B′ = U ′ U B. Now applying the projection P of Lemma 16.1.5 we have
BP = U ′ U BP and, since BP is invertible, U ′ U = In (the identity matrix). Since U and U ′
have integer entries it follows that det(U ), det(U ′ ) ∈ Z. From det(U ) det(U ′ ) = det(In ) = 1
it follows that det(U ) = ±1.
(⇐) Since the map x ↦ xU is a permutation of Zn we have {xB′ : x ∈ Zn } = {xB : x ∈ Zn }. □
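In the full rank case, Lemma 16.1.6 gives an effective test for whether two square basis matrices generate the same lattice: check that U = B′B⁻¹ is an integer matrix with det(U ) = ±1. A minimal Python sketch for 2 × 2 matrices, using exact rational arithmetic (the function names and test matrices are our own):

    from fractions import Fraction

    def det2(M):
        return M[0][0] * M[1][1] - M[0][1] * M[1][0]

    def same_lattice_2d(B1, B2):
        # U = B2 * B1^{-1}; B1^{-1} is computed via the adjugate.
        d = det2(B1)
        inv = [[Fraction(B1[1][1], d), Fraction(-B1[0][1], d)],
               [Fraction(-B1[1][0], d), Fraction(B1[0][0], d)]]
        U = [[sum(B2[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
        is_integral = all(u.denominator == 1 for row in U for u in row)
        return is_integral and abs(det2(U)) == 1

    print(same_lattice_2d([[1, 0], [0, 1]], [[2, 1], [1, 1]]))  # True: det(U) = 1
    print(same_lattice_2d([[1, 0], [0, 1]], [[2, 0], [0, 1]]))  # False: index 2 sublattice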

The Hermite normal form is defined in Section A.11. The following result is a direct
consequence of Lemma 16.1.6 and the remarks in Section A.11.

Lemma 16.1.7 If B is the basis matrix of a lattice L then the Hermite normal form of B
is also a basis matrix for L.

The determinant of a lattice L is the volume of the fundamental parallelepiped of any


basis B for L. When the lattice has full rank then using Definition A.10.7 and Lemma A.10.8
we have det(L) = | det(B)|. For the case n < m our definition uses Lemma 16.1.5.

Definition 16.1.8 Let the notation be as above. The determinant (or volume) of a lattice
L with basis matrix B is | det(BP )|, where P is a matrix representing the projection of
Lemma 16.1.5.

Lemma 16.1.9 The determinant of a lattice is independent of the choice of basis matrix B
and the choice of projection P .

Proof Let P and P ′ be two projection matrices corresponding to orthonormal bases
{v1 , . . . , vn } and {v′1 , . . . , v′n } for V = span{b1 , . . . , bn }. Then, by Lemma A.10.3, P ′ =
P W for some orthogonal matrix W (hence, det(W ) = ±1). It follows that | det(BP )| does
not depend on the choice of P .
Let B and B′ be two basis matrices for a lattice L. Then B′ = U B where U is an n × n
matrix such that det(U ) = ±1. Then det(L) = | det(BP )| = | det(U BP )| = | det(B′ P )|. □

We have seen that there are many different choices of basis for a given lattice L. A
fundamental problem is to compute a “nice” lattice basis for L; specifically one where the
vectors are relatively short and close to orthogonal. The following exercise shows that these
properties are intertwined.
Exercise 16.1.10 Let L be a rank 2 lattice in R2 and let {b1 , b2 } be a basis for L.

1. Show that

    det(L) = ‖b1 ‖ ‖b2 ‖ | sin(θ)|                (16.1)

where θ is the angle between b1 and b2 .
2. Hence, deduce that the product ‖b1 ‖ ‖b2 ‖ is minimised over all choices {b1 , b2 } of basis
for L when the angle θ is closest to ±π/2.
Definition 16.1.11 Let L be a lattice in Rm of rank n with basis matrix B. The Gram
matrix of B is BB^T . This is an n × n matrix whose (i, j )th entry is ⟨bi , bj ⟩.
Lemma 16.1.12 Let L be a lattice in Rm of rank n with basis matrix B. Then det(L) =
√det(BB^T ).

Proof Consider first the case m = n. Then det(L)² = det(B) det(B^T ) = det(BB^T ) =
det((⟨bi , bj ⟩)i,j ). Hence, when m > n, det(L)² = det((BP )(BP )^T ). Now, the (i, j )th entry
of (BP )(BP )^T is ⟨bi P , bj P ⟩, which is equal to the (i, j )th entry of BB^T by
Lemma 16.1.5. □
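Lemma 16.1.12 yields a direct way to compute lattice determinants, including the non-full-rank case. A small Python sketch (the function names are ours; the Gram determinant is computed by naive Laplace expansion, adequate for tiny examples):

    import math

    def det(M):
        # Naive Laplace expansion along the first row.
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j]
                   * det([row[:j] + row[j + 1:] for row in M[1:]])
                   for j in range(len(M)))

    def lattice_det(B):
        n, m = len(B), len(B[0])
        gram = [[sum(B[i][k] * B[j][k] for k in range(m)) for j in range(n)]
                for i in range(n)]  # the Gram matrix B B^T
        return math.sqrt(det(gram))

    print(lattice_det([[3, 4]]))          # rank 1 lattice in Z^2: det = 5.0
    print(lattice_det([[1, 0], [0, 1]]))  # Z^2: det = 1.0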
Note that an integer lattice of non-full rank may not have integer determinant.

Exercise 16.1.13 Find an example of a lattice of rank 1 in Z2 whose determinant is not an


integer.

Lemma 16.1.14 Let b1 , . . . , bn be an ordered basis for a lattice L in Rm and let b∗1 , . . . , b∗n
be the Gram–Schmidt orthogonalisation. Then det(L) = ∏_{i=1}^{n} ‖b∗i ‖.

Proof The case m = n is already proved in Lemma A.10.8. For the general case let
vi = b∗i /‖b∗i ‖ be the orthonormal basis required for the construction of the projection P .
Then P (b∗i ) = ‖b∗i ‖ ei . Write B and B∗ for the n × m matrices formed by the rows bi and
b∗i respectively. It follows that B∗ P is an n × n diagonal matrix with diagonal entries ‖b∗i ‖.
Finally, by the Gram–Schmidt construction, B∗ = U B for some n × n matrix U such that
det(U ) = 1. Combining these facts gives¹

    det(L) = | det(BP )| = | det(U BP )| = | det(B∗ P )| = ∏_{i=1}^{n} ‖b∗i ‖.  □

Exercise 16.1.15 Let {b1 , . . . , bn } be an ordered lattice basis in Rm and let {b∗1 , . . . , b∗n }
be the Gram–Schmidt orthogonalisation. Show that ‖bi ‖ ≥ ‖b∗i ‖ and hence det(L) ≤
∏_{i=1}^{n} ‖bi ‖.

Definition 16.1.16 Let {b1 , . . . , bn } be a basis for a lattice L in Rm . The orthogonality


defect of the basis is

    ( ∏_{i=1}^{n} ‖bi ‖ ) / det(L).

Exercise 16.1.17 Show that the orthogonality defect of {b1 , . . . , bn } is 1 if and only if the
basis is orthogonal.
Definition 16.1.18 Let L ⊂ Rm be a lattice of rank n. The successive minima of L are
λ1 , . . . , λn ∈ R such that, for 1 ≤ i ≤ n, λi is minimal such that there exist i linearly
independent vectors v1 , . . . , vi ∈ L with ‖vj ‖ ≤ λi for 1 ≤ j ≤ i.
It follows that 0 < λ1 ≤ λ2 ≤ · · · ≤ λn . In general, there is not a basis consisting of vectors
whose lengths are equal to the successive minima, as the following example shows.
Example 16.1.19 Let L ⊂ Zn be the set

    L = {(x1 , . . . , xn ) : x1 ≡ x2 ≡ · · · ≡ xn (mod 2)}.

It is easy to check that this is a lattice. The vectors 2ei ∈ L for 1 ≤ i ≤ n are linearly
independent and have length 2. Every other vector x ∈ L with even entries has length ≥ 2.
Every vector x ∈ L with odd entries has all xi ≠ 0 and so ‖x‖ ≥ √n.
If n = 2 the successive minima are λ1 = λ2 = √2 and if n = 3 the successive minima
are λ1 = λ2 = λ3 = √3. When n ≥ 4 then λ1 = λ2 = · · · = λn = 2. For n ≤ 4 one can
construct a basis for the lattice with vectors of lengths equal to the successive minima.
When n > 4 there is no basis for L consisting of vectors of length 2, since a basis must
contain at least one vector having odd entries.
Exercise 16.1.20 For n = 2, 3, 4 in Example 16.1.19 write down a basis for the lattice
consisting of vectors of lengths equal to the successive minima.
Exercise 16.1.21 For n > 4 in Example 16.1.19 show there is a basis for the lattice such
that ‖bi ‖ = λi for 1 ≤ i < n and ‖bn ‖ = √n.
¹ The formula BP = U⁻¹(B∗ P ) is the QR decomposition of BP .

Definition 16.1.22 Let L ⊆ Rm be a lattice and write V ⊆ Rm for the R-vector space
spanned by the vectors in L. The dual lattice of L is L∗ = {y ∈ V : ⟨x, y⟩ ∈ Z for all
x ∈ L}.
Exercise 16.1.23 Show that the dual lattice is a lattice. Let B be a basis matrix of a full
rank lattice L. Show that (B^T )⁻¹ is a basis matrix for the dual lattice. Hence, show that the
determinant of the dual lattice is det(L)⁻¹.

16.2 The Hermite and Minkowski bounds


We state the following results without rigorously defining the term volume and without giv-
ing proofs (see Section 1.3 of Micciancio and Goldwasser [378], Chapter 1 of Siegel [504],
Chapter 6 of Hoffstein, Pipher and Silverman [261] or Chapter 12 of Cassels [113] for
details).
Theorem 16.2.1 (Blichfeldt) Let L be a lattice in Rm with basis {b1 , . . . , bn } and S any
measurable set such that S ⊂ span{bi : 1 ≤ i ≤ n}. If the volume of S exceeds det(L) then
there exist two distinct points v 1 , v 2 ∈ S such that (v 1 − v 2 ) ∈ L.
Proof See Theorem 1.3 of [378] or Section III.2.1 of [113]. �
Theorem 16.2.2 (Minkowski convex body theorem) Let L be a lattice in Rm with basis
{b1 , . . . , bn } and let S be any convex set such that S ⊂ span{bi : 1 ≤ i ≤ n}, 0 ∈ S and if
v ∈ S then −v ∈ S. If the volume of S is > 2^n det(L) then there exists a non-zero lattice
point v ∈ S ∩ L.
Proof See Section III.2.2 of Cassels [113], Theorem 6.28 of Hoffstein, Pipher and Silver-
man [261], Theorem 1.4 of Micciancio and Goldwasser [378], or Theorem 6.1 of Stewart
and Tall [527]. �
The convex body theorem is used to prove Theorem 16.2.3. The intuition behind this
result is that if the shortest non-zero vector in a lattice is large then the volume of the lattice
cannot be small.
Theorem 16.2.3 Let n ∈ N. There is a constant 0 < γn ≤ n such that, for any lattice L of
rank n in Rn having first minimum λ1 (for the Euclidean norm),

    λ1² < γn det(L)^{2/n} .

Proof See Theorem 1.5 of [378], Theorem 6.25 of [261], or Theorem 12.2.1 of [113]. □
Exercise 16.2.4 Show that the convex body theorem is tight. In other words, find a lattice
L in Rn for some n and a symmetric convex subset S ⊆ Rn such that the volume of S is
2^n det(L) and yet S ∩ L = {0}.
Exercise 16.2.5 Show that, with respect to the ℓ∞ norm, λ1 ≤ det(L)^{1/n} . Show that, with
respect to the ℓ1 norm, λ1 ≤ (n! det(L))^{1/n} ≈ n det(L)^{1/n} /e.

Exercise 16.2.6 Let a, b ∈ N. Show that there is a solution r, s, t ∈ Z to r = as + bt
such that s² + r² ≤ 2b.

Definition 16.2.7 Let n ∈ N. The smallest real number γn such that

    λ1² ≤ γn det(L)^{2/n}

for all lattices L of rank n is called the Hermite constant.



Exercise 16.2.8 This exercise is to show that γ2 = 2/√3.

1. Let {b1 , b2 } be a Gauss reduced basis (see Definition 17.1.1 of the next section)
for a dimension 2 lattice in R2 . Define the quadratic form N (x, y) = ‖xb1 + yb2 ‖².
Show that, without loss of generality, N (x, y) = ax² + 2bxy + cy² with a, b, c ≥ 0
and a ≤ c.
2. Using N (1, −1) ≥ N (0, 1) (which follows from the property of being Gauss reduced),
show that 2b ≤ a. Hence, show that 3ac ≤ 4(ac − b²).
3. Show that det(L)² = |b² − ac|. Hence, deduce that Hermite's constant satisfies γ2 ≤
2/√3.
4. Show that the lattice L ⊂ R2 with basis {(1, 0), (−1/2, √3/2)} satisfies λ1² =
(2/√3) det(L).
(Optional) Show that L is equal to the ring of algebraic integers of Q(√−3). Show
that centering balls of radius 1/2 at each point of L gives the most dense lattice packing
of balls in R2 .

Section 6.5.2 of Nguyen [409] lists the first 8 values of γn , gives the bound n/(2πe) +
o(1) ≤ γn ≤ (n/(πe))(1 + o(1)) and gives further references.

Theorem 16.2.9 (Minkowski) Let L be a lattice of rank n in Rn with successive minima
λ1 , . . . , λn for the Euclidean norm. Then

    ( ∏_{i=1}^{n} λi )^{1/n} < √n det(L)^{1/n} .

Proof See Theorem 12.2.2 of [113]. (The term √n can be replaced by √γn .) □

The Gaussian heuristic states that the shortest non-zero vector in a "random" lattice L
of dimension n in Rn is expected to have length approximately

    √(n/(2πe)) det(L)^{1/n} .

We refer to Section 6.5.3 of [409] and Section 6.5.3 of [261] for discussion and references.
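As a quick illustration, the heuristic is trivial to evaluate; the parameters below are arbitrary sample values, not taken from the text.

    import math

    def gaussian_heuristic(n, det_L):
        # Expected length of a shortest non-zero vector in a "random" lattice.
        return math.sqrt(n / (2 * math.pi * math.e)) * det_L ** (1.0 / n)

    print(gaussian_heuristic(100, 2.0 ** 100))  # roughly 4.84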

16.3 Computational problems in lattices


There are several natural computational problems relating to lattices. We start by listing
some problems that can be efficiently solved using linear algebra (in particular, the Hermite
normal form).
1. Lattice membership: Given an n × m basis matrix B for a lattice L ⊆ Zm and a vector
v ∈ Zm determine whether v ∈ L.
2. Lattice basis: Given a set of vectors b1 , . . . , bn in Zm (possibly linearly dependent) find
a basis for the lattice generated by them.
3. Kernel lattice: Given an m × n integer matrix A compute a basis for the lattice ker(A) =
{x ∈ Zm : xA = 0}.
4. Kernel lattice modulo M: Given an m × n integer matrix A and an integer M compute
a basis for the lattice {x ∈ Zm : xA ≡ 0 (mod M)}.
Exercise 16.3.1 Describe explicit algorithms for the above problems and determine their
complexity.
Now we list some computational problems that seem to be hard in general.
Definition 16.3.2 Let L be a lattice in Zm .

1. The shortest vector problem (SVP) is the computational problem: given a basis matrix
B for L compute a non-zero vector v ∈ L such that ‖v‖ is minimal (i.e., ‖v‖ = λ1 ).
2. The closest vector problem (CVP) is the computational problem: given a basis matrix
B for L and a vector w ∈ Qm (one can work with high-precision approximations in Rm ,
but this is essentially still working in Qm ) compute v ∈ L such that ‖w − v‖ is minimal.
3. The decision closest vector problem (DCVP) is: given a basis matrix B for a lattice L,
a vector w ∈ Qm and a real number r > 0 decide whether or not there is a vector v ∈ L
such that ‖w − v‖ ≤ r.
4. The decision shortest vector problem is: given a basis matrix B for a lattice L and a
real number r > 0 decide whether or not there is a non-zero v ∈ L such that ‖v‖ ≤ r.
5. Fix γ > 1. The approximate SVP is: given a basis matrix B for L compute a non-zero
vector v ∈ L such that ‖v‖ ≤ γ λ1 .
6. Fix γ > 1. The approximate CVP is: given a basis matrix B for L and a vector w ∈ Qm
compute v ∈ L such that ‖w − v‖ ≤ γ ‖w − xB‖ for all x ∈ Zn .
In general, these computational problems are known to be hard² when the rank is
sufficiently large. It is known that CVP is NP-hard (this is shown by relating CVP with
subset-sum; for details see Chapter 3 of [378]). Also, SVP is NP-hard under randomised
reductions and non-uniform reductions (see Chapter 4 of [378] for an explanation of these
terms and proofs). Nguyen [409] gives a summary of the complexity results and current
best running times of algorithms for these problems.
On the other hand, if a lattice is sufficiently nice then these problems may be easy.
² We do not give details of complexity theory in this book; in particular, we do not define the term "NP-hard".

Example 16.3.3 Let L ⊂ R2 be the lattice with basis matrix

    B = [1001 0; 0 2008].

Then every lattice vector is of the form (1001a, 2008b) where a, b ∈ Z. Hence the shortest
non-zero vectors are clearly (1001, 0) and (−1001, 0). Similarly, the closest vector to
w = (5432, 6000) is clearly (5005, 6024).
Why is this example so easy? The reason is that the basis vectors are orthogonal. Even
in large dimensions, the SVP and CVP problems are easy if one has an orthogonal basis for
a lattice. When given a basis that is not orthogonal it is less obvious whether there exists
a non-trivial linear combination of the basis vectors that gives a vector strictly shorter than
the shortest basis vector. A basis for a lattice that is "as close to orthogonal as it can be" is
therefore convenient for solving some computational problems.
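For an orthogonal (here diagonal) basis the CVP computation of Example 16.3.3 amounts to rounding each coordinate independently; a one-function Python sketch (the function name is ours):

    def cvp_diagonal(diag, w):
        # Closest vector when the basis vectors are diag[i] * e_i:
        # round each coefficient w[i] / diag[i] separately.
        return tuple(d * round(wi / d) for d, wi in zip(diag, w))

    print(cvp_diagonal((1001, 2008), (5432, 6000)))  # (5005, 6024)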
17
Lattice basis reduction

The goal of lattice basis reduction is to transform a given lattice basis into a “nice” lattice
basis consisting of vectors that are short and close to orthogonal. To achieve this, one
needs both a suitable mathematical definition of “nice basis” and an efficient algorithm to
compute a basis satisfying this definition.
Reduction of lattice bases of rank 2 in R2 was given by Lagrange¹ and Gauss. The
algorithm is closely related to Euclid's algorithm and we briefly present it in Section 17.1.
The main goal of this section is to present the lattice basis reduction algorithm of Lenstra,
Lenstra and Lovász, known as the LLL or L³ algorithm. This is a very important algorithm
for practical applications. Some basic references for the LLL algorithm are Section 14.3 of
Smart [513], Section 2.6 of Cohen [127] and Chapter 17 of Trappe and Washington [547].
More detailed treatments are given in von zur Gathen and Gerhard [220], Grötschel, Lovász
and Schrijver [245], Section 1.2 of Lovász [356], and Nguyen and Vallée [416]. I also highly
recommend the original paper [335].
The LLL algorithm generalises the Lagrange–Gauss algorithm and exploits the Gram–
Schmidt orthogonalisation. Note that the Gram–Schmidt process is not useful, in general,
for lattices since the coefficients µi,j do not usually lie in Z and so the resulting vectors are
not usually elements of the lattice. The LLL algorithm uses the Gram–Schmidt vectors to
determine the quality of the lattice basis, but ensures that the linear combinations used to
update the lattice vectors are all over Z.

17.1 Lattice basis reduction in two dimensions


Let b1 , b2 ∈ R2 be linearly independent vectors and denote by L the lattice for which they are
a basis. The goal is to output a basis for the lattice such that the lengths of the basis vectors
are as short as possible (in this case, equal to the successive minima). Lagrange and Gauss
gave the following criteria for a basis to be reduced and then developed Algorithm 22 to
compute such a basis.

¹ The algorithm was first written down by Lagrange and later by Gauss, but is usually called the "Gauss algorithm". We refer to
[408] for the original references.


Definition 17.1.1 An ordered basis b1 , b2 for R2 is Lagrange–Gauss reduced if ‖b1 ‖ ≤
‖b2 ‖ ≤ ‖b2 + qb1 ‖ for all q ∈ Z.
The following theorem shows that the vectors in a Lagrange–Gauss reduced basis are as
short as possible. This result holds for any norm, though the algorithm presented below is
only for the Euclidean norm.
Theorem 17.1.2 Let λ1 , λ2 be the successive minima of L. If L has an ordered basis
{b1 , b2 } that is Lagrange–Gauss reduced then ‖bi ‖ = λi for i = 1, 2.

Proof By definition we have

    ‖b2 + qb1 ‖ ≥ ‖b2 ‖ ≥ ‖b1 ‖

for all q ∈ Z.
Let v = l1 b1 + l2 b2 be any non-zero point in L. If l2 = 0 then ‖v‖ ≥ ‖b1 ‖. If l2 ≠ 0
then write l1 = ql2 + r with q, r ∈ Z such that 0 ≤ r < |l2 |. Then v = rb1 + l2 (b2 + qb1 )
and, by the triangle inequality,

    ‖v‖ ≥ |l2 | ‖b2 + qb1 ‖ − r ‖b1 ‖
        = (|l2 | − r) ‖b2 + qb1 ‖ + r (‖b2 + qb1 ‖ − ‖b1 ‖)
        ≥ ‖b2 + qb1 ‖ ≥ ‖b2 ‖ ≥ ‖b1 ‖.

This completes the proof. □
Definition 17.1.3 Let b1 , . . . , bn be a list of vectors in Rn . We write² Bi = ‖bi ‖² = ⟨bi , bi ⟩.
A crucial ingredient for the Lagrange–Gauss algorithm is that

    ‖b2 − µb1 ‖² = B2 − 2µ⟨b1 , b2 ⟩ + µ²B1                (17.1)

is minimised at µ = ⟨b1 , b2 ⟩/B1 (to see this, note that the graph as a function of µ is a
parabola and that the minimum can be found by differentiating with respect to µ). Since
we are working in a lattice we therefore replace b2 by b2 − ⌈µ⌋b1 where ⌈µ⌋ is the nearest
integer to µ. Hence, lines 3 and 9 of Algorithm 22 reduce the size of b2 as much as possible
using b1 . In the one-dimensional case, the formula b2 − ⌈µ⌋b1 is the familiar operation
ri+1 = ri−1 − ⌈ri−1 /ri ⌋ri from Euclid's algorithm.
Lemma 17.1.4 An ordered basis {b1 , b2 } is Lagrange–Gauss reduced if and only if
‖b1 ‖ ≤ ‖b2 ‖ ≤ ‖b2 ± b1 ‖.

Proof The forward implication is trivial. For the converse, suppose ‖b2 ‖ ≤ ‖b2 ± b1 ‖.
We use the fact that the graph of F (µ) = ‖b2 + µb1 ‖² is a parabola. It follows that the
minimum of F (µ) is taken for −1 < µ < 1. Hence, ‖b2 ‖ ≤ ‖b2 + qb1 ‖ for q ∈ Z such
that |q| > 1. □

² The reader is warned that the notation Bi will have a different meaning when we are discussing the LLL algorithm.

Algorithm 22 gives the Lagrange–Gauss algorithm for lattices in Z2 . Note that the
computation of µ is as an exact value in Q. All other arithmetic is exact integer arithmetic.

Lemma 17.1.5 Algorithm 22 terminates and outputs a Lagrange–Gauss reduced basis for
the lattice L.

Exercise 17.1.6 Prove Lemma 17.1.5.

Example 17.1.7 We run the Lagrange–Gauss algorithm on b1 = (1, 5) and b2 = (6, 21).
In the first step, µ = 111/26 ≈ 4.27 and so we update b2 = b2 − 4b1 = (2, 1). We then
swap b1 and b2 so that the values in the loop are now b1 = (2, 1) and b2 = (1, 5). This time,
µ = 7/5 = 1.4 and so we set b2 = b2 − b1 = (−1, 4). Since ‖b2 ‖ > ‖b1 ‖ the algorithm
halts and outputs {(2, 1), (−1, 4)}.

Algorithm 22 Lagrange–Gauss lattice basis reduction

Input: Basis b1 , b2 ∈ Z2 for a lattice L
Output: Basis (b1 , b2 ) for L such that ‖bi ‖ = λi
1: B1 = ‖b1 ‖²
2: µ = ⟨b1 , b2 ⟩/B1
3: b2 = b2 − ⌈µ⌋b1
4: B2 = ‖b2 ‖²
5: while B2 < B1 do
6:    Swap b1 and b2
7:    B1 = B2
8:    µ = ⟨b1 , b2 ⟩/B1
9:    b2 = b2 − ⌈µ⌋b1
10:   B2 = ‖b2 ‖²
11: end while
12: return (b1 , b2 )
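A direct transcription of Algorithm 22 into Python, using exact integer and rational arithmetic. One caveat: Python's round rounds ties to the nearest even integer rather than following the convention ⌈1/2⌋ = 1; for inputs where µ is never a half-integer (as in the examples here) the two conventions agree.

    from fractions import Fraction

    def lagrange_gauss(b1, b2):
        def dot(u, v):
            return u[0] * v[0] + u[1] * v[1]

        B1 = dot(b1, b1)
        mu = Fraction(dot(b1, b2), B1)
        b2 = (b2[0] - round(mu) * b1[0], b2[1] - round(mu) * b1[1])
        B2 = dot(b2, b2)
        while B2 < B1:
            b1, b2 = b2, b1
            B1 = B2
            mu = Fraction(dot(b1, b2), B1)
            b2 = (b2[0] - round(mu) * b1[0], b2[1] - round(mu) * b1[1])
            B2 = dot(b2, b2)
        return b1, b2

    print(lagrange_gauss((1, 5), (6, 21)))  # ((2, 1), (-1, 4)), as in Example 17.1.7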

Exercise 17.1.8 Run the Lagrange–Gauss reduction algorithm on the basis {(3, 8), (5, 14)}.

Lemma 17.1.9 Let b1 , b2 be the initial vectors in an iteration of the Lagrange–Gauss
algorithm and suppose b′1 = b2 − mb1 and b′2 = b1 are the vectors that will be considered
in the next step of the algorithm. Then ‖b′1 ‖² < ‖b1 ‖²/3, except perhaps for the last two
iterations.

Proof Note that m = ⌈µ⌋ = ⌈⟨b1 , b2 ⟩/⟨b1 , b1 ⟩⌋ = ⟨b1 , b2 ⟩/⟨b1 , b1 ⟩ + ε where |ε| ≤ 1/2.
Hence

    ⟨b1 , b′1 ⟩ = ⟨b1 , b2 ⟩ − (⟨b1 , b2 ⟩/⟨b1 , b1 ⟩ + ε)⟨b1 , b1 ⟩ = −ε⟨b1 , b1 ⟩ = −ε‖b1 ‖².

We show that ‖b′1 ‖² < ‖b1 ‖²/3 unless we are in the last two iterations of the algorithm.
To do this, suppose that ‖b′1 ‖² ≥ ‖b1 ‖²/3. Then

    |⟨b′1 , b′2 ⟩| = |⟨b′1 , b1 ⟩| = |ε| ‖b1 ‖² ≤ (1/2)‖b1 ‖² ≤ (3/2)‖b′1 ‖².

It follows that, in the next iteration of the algorithm, we will be taking m = ⌈µ⌋ ∈ {−1, 0, 1}
and so the next iteration would, at most, replace b1 with b1 ± b′1 = b1 ± (b2 − mb1 ). But,
if this were smaller than b′1 then we would have already computed b′1 differently in the
current iteration. Hence, the next step is the final iteration. □

Theorem 17.1.10 Let X ∈ Z≥2 and let b1 , b2 be vectors in Z2 such that ‖bi ‖² ≤ X. Then
the Lagrange–Gauss algorithm performs O(log(X)³) bit operations.

Proof Lemma 17.1.9 shows that there are O(log(X)) iterations in the Lagrange–Gauss
algorithm. Since the squared Euclidean lengths of all vectors in the algorithm are bounded
by X, it follows that entries of vectors are integers bounded by √X. Similarly, the numerator
and denominator of µ ∈ Q require O(log(X)) bits. The result follows. □

A much more precise analysis of the Lagrange–Gauss reduction algorithm is given by


Vallée [551]. Indeed, the algorithm has complexity O(log(X)²) bit operations; see Nguyen
and Stehlé [408].
The above discussion is for the Euclidean norm, but the Lagrange–Gauss reduction
algorithm can be performed for any norm (the only change is how one computes µ). We
refer to Kaib and Schnorr [292] for analysis and details.
Finally, we remark that there is a natural analogue of Definition 17.1.1 for any dimension.
Hence, it is natural to try to generalise the Lagrange–Gauss algorithm to higher dimensions.
Generalisations to dimension three have been given by Vallée [550] and Semaev [485].
There are a number of problems when generalising to higher dimensions. For example,
choosing the right linear combination to size reduce bn using b1 , . . . , bn−1 is solving the
CVP in a sublattice (which is a hard problem). Furthermore, there is no guarantee that the
resulting basis actually has good properties in high dimension. We refer to Nguyen and
Stehlé [413] for a full discussion of these issues and an algorithm that works in dimensions
3 and 4.

17.1.1 Connection between Lagrange–Gauss reduction and Euclid’s algorithm


The Lagrange–Gauss algorithm is closely related to Euclid's algorithm. We briefly discuss
some similarities and differences. Recall that if a, b ∈ Z then Euclid's algorithm (using
signed remainders) produces a sequence of integers ri , si , ti such that

    asi + bti = ri

where |ri ti | < |a| and |ri si | < |b|. The precise formulae are ri+1 = ri−1 − qri and si+1 =
si−1 − qsi where q = ⌈ri−1 /ri ⌋. The sequence |ri | is strictly decreasing. The initial values
are r−1 = a, r0 = b, s−1 = 1, s0 = 0, t−1 = 0, t0 = 1. In other words, the lattice with basis
matrix

    B = [0 b; 1 a] = [s0 r0 ; s−1 r−1 ]

contains the vectors

    (si , ri ) = (ti , si )B.

These vectors are typically shorter than the original vectors of the lattice.
We claim that if si is sufficiently small compared with ri then one step of the Lagrange–
Gauss algorithm on B corresponds to one step of Euclid's algorithm (with negative
remainders).
To see this, let b1 = (si , ri ) and consider the Lagrange–Gauss algorithm with b2 =
(si−1 , ri−1 ). First compute the value

    µ = ⟨b1 , b2 ⟩/⟨b1 , b1 ⟩ = (si si−1 + ri ri−1 )/(si² + ri²).

If si is sufficiently small relative to ri (e.g., in the first step, when s0 = 0) then

    ⌈µ⌋ = ⌈ri ri−1 /ri²⌋ = ⌈ri−1 /ri ⌋ = q.

Hence, the operation v = b2 − ⌈µ⌋b1 is v = (si−1 − qsi , ri−1 − qri ), which agrees with
Euclid's algorithm. Finally, the Lagrange–Gauss algorithm compares the lengths of the
vectors v and b1 to see if they should be swapped. When si+1 is small compared with ri+1
then v is smaller than b1 . Hence, the vectors are swapped and the matrix becomes

    [si−1 − qsi ri−1 − qri ; si ri ]

just as in Euclid's algorithm.
The algorithms start to deviate once the si become large (this can already happen on the
second iteration, as the example below shows). Further, Euclid's algorithm runs until ri = 0
(in which case si ≈ b) whereas Lagrange–Gauss reduction stops when ri ≈ si .

Example 17.1.11 Let a = 19 and b = 8. The sequence of remainders in the signed


Euclidean algorithm is 3, −1 while the Lagrange–Gauss lattice basis reduction algorithm
computes remainders 3, 2.

Example 17.1.12 Consider a = 8239876 and b = 1020301, which have gcd equal to one.
Let

    B = [0 b; 1 a].

Running the Lagrange–Gauss algorithm on this matrix gives

    [540 379; 619 −1455].

One can verify that

379 = 540a + t4 b where t4 = −4361

and

−1455 = 619a + t5 b where t5 = −4999.

17.2 LLL-reduced lattice bases


This section presents the crucial definition from [335] and some of its consequences. The
main result is Theorem 17.2.12, which shows that an LLL-reduced lattice basis does have
good properties.
Recall first that if b1 , . . . , bn is a set of vectors in Rm then one can define the Gram–
Schmidt orthogonalisation b∗1 , . . . , b∗n as in Section A.10.2. We use the notation µi,j =
⟨bi , b∗j ⟩/⟨b∗j , b∗j ⟩ throughout.
As we have noted in Example 16.3.3, computational problems in lattices can be easy
if one has a basis that is orthogonal or “sufficiently close to orthogonal”. A simple but
important observation is that one can determine when a basis is close to orthogonal by
considering the lengths of the Gram–Schmidt vectors. More precisely, a lattice basis is
“close to orthogonal” if the lengths of the Gram–Schmidt vectors do not decrease too
rapidly.

Example 17.2.1 Two bases for Z2 are {(1, 0), (0, 1)} and {(23, 24), (24, 25)}. In the first
case, the Gram–Schmidt vectors both have length 1. In the second case, the Gram–Schmidt
vectors are b∗1 = (23, 24) and b∗2 = (24/1105, −23/1105), which have lengths √1105 ≈
33.24 and 1/√1105 ≈ 0.03 respectively. The fact that the lengths of the Gram–Schmidt
vectors decrease dramatically reveals that the original basis is not of good quality.
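The numbers in Example 17.2.1 are easy to check directly from b∗2 = b2 − µ2,1 b1 :

    import math

    b1, b2 = (23, 24), (24, 25)
    mu = (b1[0] * b2[0] + b1[1] * b2[1]) / (b1[0] ** 2 + b1[1] ** 2)
    b2_star = (b2[0] - mu * b1[0], b2[1] - mu * b1[1])
    print(math.hypot(*b1))       # 33.2415... = sqrt(1105)
    print(math.hypot(*b2_star))  # 0.0301...  = 1/sqrt(1105)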

We now list some easy properties of the Gram–Schmidt orthogonalisation.

Lemma 17.2.2 Let {b1 , . . . , bn } be linearly independent in Rm and let {b∗1 , . . . , b∗n } be the
Gram–Schmidt orthogonalisation.

1. ‖b∗i ‖ ≤ ‖bi ‖ for 1 ≤ i ≤ n.
2. ⟨bi , b∗i ⟩ = ⟨b∗i , b∗i ⟩ for 1 ≤ i ≤ n.
3. Denote the closest integer to µk,j by ⌈µk,j ⌋. If b′k = bk − ⌈µk,j ⌋bj for 1 ≤ k ≤ n and
1 ≤ j < k and if µ′k,j = ⟨b′k , b∗j ⟩/⟨b∗j , b∗j ⟩ then |µ′k,j | ≤ 1/2.

Exercise 17.2.3 Prove Lemma 17.2.2.

Definition 17.2.4 Let {b1 , . . . , bn } be an ordered basis for a lattice. Denote by {b∗1 , . . . , b∗n }
the Gram–Schmidt orthogonalisation and write Bi = ‖b∗i ‖² = ⟨b∗i , b∗i ⟩. Let

    µi,j = ⟨bi , b∗j ⟩/⟨b∗j , b∗j ⟩

for 1 ≤ j < i ≤ n be the coefficients from the Gram–Schmidt process. Fix 1/4 < δ < 1.
The (ordered) basis is LLL-reduced (with factor δ) if the following conditions hold:

• (Size reduced) |µi,j | ≤ 1/2 for 1 ≤ j < i ≤ n.
• (Lovász condition)

    Bi ≥ (δ − µ²i,i−1 )Bi−1

for 2 ≤ i ≤ n.

It is traditional to choose δ = 3/4 in the Lovász condition.
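The definition can be tested mechanically. Below is a minimal Python sketch using exact rational arithmetic; the function name is ours, and the test bases are those of Example 17.2.1.

    from fractions import Fraction

    def is_lll_reduced(basis, delta=Fraction(3, 4)):
        n = len(basis)
        ortho, mu = [], [[Fraction(0)] * n for _ in range(n)]
        for i, b in enumerate(basis):  # exact Gram-Schmidt orthogonalisation
            v = [Fraction(x) for x in b]
            for j in range(i):
                mu[i][j] = (sum(Fraction(x) * y for x, y in zip(b, ortho[j]))
                            / sum(y * y for y in ortho[j]))
                v = [vi - mu[i][j] * yj for vi, yj in zip(v, ortho[j])]
            ortho.append(v)
        B = [sum(y * y for y in v) for v in ortho]
        size_reduced = all(abs(mu[i][j]) <= Fraction(1, 2)
                           for i in range(n) for j in range(i))
        lovasz = all(B[i] >= (delta - mu[i][i - 1] ** 2) * B[i - 1]
                     for i in range(1, n))
        return size_reduced and lovasz

    print(is_lll_reduced([(1, 0), (0, 1)]))      # True
    print(is_lll_reduced([(23, 24), (24, 25)]))  # False: not size reduced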

Exercise 17.2.5 Which of the following basis matrices represents an LLL-reduced basis
(with δ = 3/4)?

    [1 0; 0 2], [0 4; 1 0], [1 −2; 3 1], [5 0; 0 4], [10 0; 0 9].

Exercise 17.2.6 Prove that an equivalent formulation (more in the flavour of the Lagrange–
Gauss method) of the Lovász condition is

    Bi + µ²i,i−1 Bi−1 = ‖b∗i + µi,i−1 b∗i−1 ‖² ≥ δBi−1 .

Exercise 17.2.7 Find an ordered basis {b1 , b2 } in R2 that is LLL-reduced, but has the
property that ‖b2 ‖ < ‖b1 ‖ and that the ordered basis {b2 , b1 } is not LLL-reduced.

For the moment we do not concern ourselves with the question of whether an LLL-
reduced basis can exist for every lattice L. In Section 17.4 we will present the LLL
algorithm, which constructs such a basis for any lattice (hence, giving a constructive
existence proof for an LLL-reduced basis).
The following properties of an LLL-reduced basis hold.

Lemma 17.2.8 Let {b1 , . . . , bn } be an LLL-reduced basis with δ = 3/4 for a lattice L ⊂
Rm . Let the notation be as above. In particular, ‖·‖ is the Euclidean norm.

1. Bj ≤ 2^{i−j} Bi for 1 ≤ j ≤ i ≤ n.
2. Bi ≤ ‖bi ‖² ≤ (1/2 + 2^{i−2} )Bi for 1 ≤ i ≤ n.
3. ‖bj ‖ ≤ 2^{(i−1)/2} ‖b∗i ‖ for 1 ≤ j ≤ i ≤ n.

Proof

1. The Lovász condition implies Bi ≥ (3/4 − 1/4)Bi−1 = (1/2)Bi−1 and the result follows by
induction.
2. From bi = b∗i + ∑_{j=1}^{i−1} µi,j b∗j we have

    ‖bi ‖² = ⟨bi , bi ⟩ = ⟨b∗i + ∑_{j=1}^{i−1} µi,j b∗j , b∗i + ∑_{j=1}^{i−1} µi,j b∗j ⟩ = Bi + ∑_{j=1}^{i−1} µ²i,j Bj ,

which is clearly ≥ Bi . By part 1 this is at most Bi (1 + (1/4) ∑_{j=1}^{i−1} 2^{i−j} ) = Bi (1 + (1/4)(2^i −
2)) = Bi (1/2 + 2^{i−2} ).
3. Since j ≥ 1 we have 1/2 + 2^{j−2} ≤ 2^{j−1} . Part 2 can therefore be written as ‖bj ‖² ≤
2^{j−1} Bj . By part 1, Bj ≤ 2^{i−j} Bi and so ‖bj ‖² ≤ 2^{j−1} 2^{i−j} Bi = 2^{i−1} Bi . Taking square
roots gives the result. □

We now give the same result for a slightly different value of δ.

Lemma 17.2.9 Let {b1 , . . . , bn } be an LLL-reduced basis with δ = 1/4 + 1/√2 ≈ 0.957
for a lattice L ⊂ Rm . Let the notation be as above. In particular, ‖·‖ is the Euclidean
norm:

1. Bj ≤ 2^{(i−j)/2} Bi for 1 ≤ j ≤ i ≤ n.
2. Bi ≤ ‖bi ‖² ≤ (1/6 + 2^{(i−1)/2} )Bi for 1 ≤ i ≤ n.
3. ‖bj ‖ ≤ 2^{i/4} ‖b∗i ‖ for 1 ≤ j ≤ i ≤ n.

Exercise 17.2.10 Prove Lemma 17.2.9.

Lemma 17.2.11 Let {b1 , . . . , bn } be an ordered basis for a lattice L ⊂ Rm and let
{b∗1 , . . . , b∗n } be the Gram–Schmidt orthogonalisation. Let λ1 be the length of the shortest
non-zero vector in the lattice. Then

    λ1 ≥ min_{1≤i≤n} ‖b∗i ‖.

Furthermore, let w1 , . . . , wi ∈ L be linearly independent lattice vectors such that
max{‖w1 ‖, . . . , ‖wi ‖} = λi , as in the definition of successive minima. Write wj =
∑_{k=1}^{n} zj,k bk . For 1 ≤ j ≤ i denote by k(j ) the largest value for k such that 1 ≤ k ≤ n
and zj,k ≠ 0. Then ‖wj ‖ ≥ ‖b∗k(j) ‖.

Proof Let x = (x1 , . . . , xn ) ∈ Zn be arbitrary such that x ≠ 0. Let i be the largest index
such that xi ≠ 0. We will show that ‖xB‖ ≥ ‖b∗i ‖, from which the result follows.
We have xB = ∑_{j=1}^{i} xj bj . Since b∗i is orthogonal to the span of {b1 , . . . , bi−1 } we have
⟨xB, b∗i ⟩ = xi ⟨b∗i , b∗i ⟩ = xi ‖b∗i ‖². Since xi ∈ Z and xi ≠ 0 it follows that |⟨xB, b∗i ⟩| ≥
‖b∗i ‖². By part 4 of Lemma A.10.3 it follows that

    ‖xB‖ ≥ ‖b∗i ‖,

which completes the proof. □

Theorem 17.2.12 shows that an LLL-reduced lattice basis has good properties. In par-
ticular, the first vector of an LLL-reduced lattice basis has length at most 2^{(n−1)/2} times the
length of a shortest non-zero vector.

Theorem 17.2.12 Let {b1 , . . . , bn } be an LLL-reduced basis with δ = 3/4 for a lattice
L ⊂ Rm . Let the notation be as above. In particular, ‖·‖ is the Euclidean norm.

1. ‖b1 ‖ ≤ 2^{(n−1)/2} λ1 .
2. ‖bj ‖ ≤ 2^{(n−1)/2} λi for 1 ≤ j ≤ i ≤ n. (This may look strange, but it tends to be used for
fixed i and varying j , rather than the other way around.)
3. 2^{(1−i)/2} λi ≤ ‖bi ‖ ≤ 2^{(n−1)/2} λi .
4. det(L) ≤ ∏_{i=1}^{n} ‖bi ‖ ≤ 2^{n(n−1)/4} det(L).
5. ‖b1 ‖ ≤ 2^{(n−1)/4} det(L)^{1/n} .

Proof

1. From part 1 of Lemma 17.2.8 we have ‖b∗i ‖ ≥ 2^{(1−i)/2} ‖b∗1 ‖. Hence, Lemma 17.2.11
implies

    λ1 ≥ min_{1≤i≤n} ‖b∗i ‖ ≥ min_{1≤i≤n} 2^{(1−i)/2} ‖b∗1 ‖ = 2^{(1−n)/2} ‖b∗1 ‖.

The result follows since b∗1 = b1 .
2. Let w1 , . . . , wi ∈ L be linearly independent lattice vectors such that max{‖w1 ‖, . . . ,
‖wi ‖} = λi . Let k(j ) be defined as in Lemma 17.2.11 so that ‖wj ‖ ≥ ‖b∗k(j) ‖.
Renumber the vectors wj so that k(1) ≤ k(2) ≤ · · · ≤ k(i). We claim that j ≤ k(j ).
If not then w1 , . . . , wj would belong to the span of {b1 , . . . , bj−1 } and would be linearly
dependent.
Finally

    ‖bj ‖ ≤ 2^{(k(j)−1)/2} ‖b∗k(j) ‖ ≤ 2^{(n−1)/2} ‖wj ‖ ≤ 2^{(n−1)/2} λi ,

which proves the result.
3. The upper bound on ‖bi ‖ is given by part 2.
Since {b1 , . . . , bi } are linearly independent we have λi ≤ max_{1≤j≤i} ‖bj ‖ and by part
3 of Lemma 17.2.8 each ‖bj ‖ ≤ 2^{(i−1)/2} ‖b∗i ‖. Using ‖b∗i ‖ ≤ ‖bi ‖ we obtain the lower
bound on ‖bi ‖.
4. By Lemma 16.1.14 we have det(L) = ∏_{i=1}^{n} ‖b∗i ‖. The result follows from ‖b∗i ‖ ≤
‖bi ‖ ≤ 2^{(i−1)/2} ‖b∗i ‖.
5. By part 3 of Lemma 17.2.8 we have ‖b1 ‖ ≤ 2^{(i−1)/2} ‖b∗i ‖ and so

    ‖b1 ‖^n ≤ ∏_{i=1}^{n} 2^{(i−1)/2} ‖b∗i ‖ = 2^{n(n−1)/4} det(L). □

Corollary 17.2.13 If ‖b1 ‖ ≤ ‖b∗i ‖ for all 1 ≤ i ≤ n then b1 is a correct solution to SVP.

Exercise 17.2.14 Prove Corollary 17.2.13.

Exercise 17.2.15 Suppose L is a lattice in Zm and let {b1 , . . . , bn } be an LLL-reduced
basis. Rename these vectors as v1 , . . . , vn such that 1 ≤ ‖v1 ‖ ≤ ‖v2 ‖ ≤ · · · ≤ ‖vn ‖. Show
that one does not necessarily have v1 = b1 . Show that, for 1 ≤ i ≤ n,

    ‖vi ‖ ≤ (2^{n(n−1)/4} det(L))^{1/(n+1−i)} .

As a final remark, the results in this section have only given upper bounds on the
sizes of ‖bi ‖ in an LLL-reduced lattice basis. In many practical instances, one finds that
LLL-reduced lattice vectors are much shorter than these bounds might suggest.

17.3 The Gram–Schmidt algorithm


The LLL algorithm requires computing a Gram–Schmidt basis. For the complexity analysis
of the LLL algorithm it is necessary to give a more careful description and analysis of the
Gram–Schmidt algorithm than was done in Section A.10.2. We present pseudocode in
Algorithm 23 (the “downto” in line 4 is not necessary, but we write it that way for future
reference in the LLL algorithm).

Algorithm 23 Gram–Schmidt algorithm

Input: {b1 , . . . , bn } in Rm
Output: {b∗1 , . . . , b∗n } in Rm
1: b∗1 = b1
2: for i = 2 to n do
3:    v = bi
4:    for j := i − 1 downto 1 do
5:        µi,j = ⟨bi , b∗j ⟩/⟨b∗j , b∗j ⟩
6:        v = v − µi,j b∗j
7:    end for
8:    b∗i = v
9: end for
10: return {b∗1 , . . . , b∗n }
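A transcription of Algorithm 23 into Python using exact arithmetic over Q (fractions.Fraction), as assumed in the analysis below; the function name is ours.

    from fractions import Fraction

    def gram_schmidt(basis):
        # Returns the vectors b_i^* and the coefficients mu[i][j], exactly.
        n = len(basis)
        ortho = []
        mu = [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            v = [Fraction(x) for x in basis[i]]
            for j in range(i - 1, -1, -1):  # "downto", as in line 4
                num = sum(Fraction(x) * y for x, y in zip(basis[i], ortho[j]))
                den = sum(y * y for y in ortho[j])
                mu[i][j] = num / den
                v = [vi - mu[i][j] * yj for vi, yj in zip(v, ortho[j])]
            ortho.append(v)
        return ortho, mu

    ortho, mu = gram_schmidt([(1, 0, 0), (4, 2, 15), (0, 0, 3)])
    print(ortho[1], mu[1][0])  # b_2^* = (0, 2, 15) and mu_{2,1} = 4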

When working in R the standard way to implement this algorithm is using floating-point
arithmetic. However, problems can arise (especially if the b∗i decrease quickly in size).
Such issues are beyond the scope of this book; we refer to Higham [259] for details.
If the input vectors lie in Zm then one can perform Algorithm 23 using exact arithmetic
over Q. However, the integers can become very large (this is called coefficient explosion).
We now analyse the size of the integers and prove the complexity of the exact version of the
Gram–Schmidt algorithm. These results are used later when determining the complexity
of LLL.

Definition 17.3.1 Let b1 , . . . , bn be an ordered set of vectors in Zm . Define Bi = ‖b∗i ‖² (as
before). For 1 ≤ i ≤ n − 1 define the i × m matrix B(i) whose rows are b1 , . . . , bi . Define
d0 = 1 and, for 1 ≤ i ≤ n

    di = det(B(i) B(i)^T ) = det((⟨bj , bk ⟩)1≤j,k≤i ) ∈ Z,

which is the square of the volume of the sublattice generated by B(i) .

Lemma 17.3.2 Let the notation be as above.

1. di = ∏_{j=1}^{i} Bj for 1 ≤ i ≤ n.
2. Bi = di /di−1 for 1 ≤ i ≤ n.
3. di−1 b∗i ∈ L ⊆ Zm for 1 ≤ i ≤ n, where L is the lattice spanned by {b1 , . . . , bn }.
4. dj µi,j ∈ Z for 1 ≤ j < i ≤ n.

Proof

1. Write L(i) for the lattice spanned by the first i vectors (i.e., L(i) is given by the matrix
B(i) ). Then di = det(L(i) )² = ∏_{j=1}^{i} ‖b∗j ‖² = ∏_{j=1}^{i} Bj by Lemma 16.1.14.
2. This property follows immediately from the previous one.
3. Write b∗i = bi − ∑_{j=1}^{i−1} ai,j bj for some ai,j ∈ R. Note that the sum is over vectors
bj not b∗j , so the ai,j are not the same as the µi,j . Since ⟨bl , b∗i ⟩ = 0 for 1 ≤ l < i
we have

    ⟨bl , bi ⟩ = ∑_{j=1}^{i−1} ai,j ⟨bl , bj ⟩,

which corresponds to the matrix product

    (⟨bi , b1 ⟩, . . . , ⟨bi , bi−1 ⟩) = (ai,1 , . . . , ai,i−1 )B(i−1) B(i−1)^T .

Inverting B(i−1) B(i−1)^T to solve for the ai,j gives di−1 ai,j ∈ Z. It follows that di−1 b∗i ∈
L ⊂ Zm as required.
4. By the previous results we have dj µi,j = dj−1 Bj ⟨bi , b∗j ⟩/Bj = ⟨bi , dj−1 b∗j ⟩ ∈ Z.

Exercise 17.3.3 Consider the vector v = bi − ∑_{k=j}^{i−1} µi,k b∗k in line 6 of Algorithm 23 during
iteration j . Show that

    ‖v‖² = ‖bi ‖² − ∑_{k=j}^{i−1} µ²i,k ‖b∗k ‖².

Deduce that ‖v‖ ≤ ‖bi ‖ and that di−1 v ∈ Zm throughout the loop in line 4 of the algorithm.

Theorem 17.3.4 Let b1 , . . . , bn be vectors in Zm . Let X ∈ Z≥2 be such that ‖bi ‖² ≤ X for
1 ≤ i ≤ n. Then the Gram–Schmidt algorithm performs O(n⁴ m log(X)²) bit operations.
The output size is O(n² m log(X)).

Proof One runs Algorithm 23 using exact Q arithmetic for the vectors b∗i . Lemma 17.3.2
shows that the denominators in b∗i are all factors of di−1 , which has size di−1 = ∏_{j=1}^{i−1} Bj ≤
∏_{j=1}^{i−1} ‖bj ‖² ≤ X^{i−1} . Also, ‖b∗i ‖ ≤ ‖bi ‖ ≤ X, so the numerators are bounded by X^i . The
size of each vector b∗i and, by Exercise 17.3.3, the intermediate steps v in the computation
are therefore O(mi log(X)) bits, which gives the output size of the algorithm. The compu-
tation ⟨bi , b∗j ⟩ requires O(mn log(X)²) bit operations and the computation ⟨b∗j , b∗j ⟩ requires
O(mn² log(X)²) bit operations. As there are O(n²) vector operations to perform, one gets
the stated running time. □

Corollary 17.3.5 Let the notation be as in Theorem 17.3.4 and let L be the lattice in Zm
with basis {b1 , . . . , bn }. Then one can compute det(L)² in O(n⁴ m log(X)²) bit operations.³

Proof Lemma 16.1.14 implies det(L)² = ∏_{i=1}^{n} ‖b∗i ‖². One computes b∗i using exact (naive)
arithmetic over Q in O(n⁴ m log(X)²) bit operations. One computes each ‖b∗i ‖² ∈ Q in
O(mn² log(X)²) bit operations. Since ‖b∗i ‖² ≤ X and di−1 ‖b∗i ‖² ∈ Z it follows that ‖b∗i ‖²
is a ratio of integers bounded by X^n . One computes the product of the ‖b∗i ‖² in O(n³ log(X)²)
bit operations (since the integers in the product are bounded by X^{n²} ). Finally, one can reduce
the fraction using Euclid's algorithm and division in O(n⁴ log(X)²) bit operations. □

17.4 The LLL algorithm


The Lenstra–Lenstra–Lovász (LLL) algorithm is an iterative algorithm that transforms
a given lattice basis into an LLL-reduced one. Since the definition of LLL-reduced uses
Gram–Schmidt vectors, the algorithm performs the Gram–Schmidt method as a subroutine.
The first condition of Definition 17.2.4 is easily met by taking suitable integer linear
combinations. If the second condition is not met then bi is not significantly longer than bi−1 .
In this case, we swap bi and bi−1 and backtrack. The swapping of vectors is familiar from
the Lagrange–Gauss two-dimensional lattice basis reduction algorithm and also Euclid’s
algorithm. We give the precise details in Algorithm 24.

³ Since det(L)² ∈ Z while det(L) may not be rational if n < m, we prefer to work with det(L)².

Algorithm 24 LLL algorithm with Euclidean norm (typically, choose δ = 3/4)

Input: b1 , . . . , bn ∈ Zm
Output: LLL-reduced basis b1 , . . . , bn
1: Compute the Gram–Schmidt basis b∗1 , . . . , b∗n and coefficients µi,j for 1 ≤ j < i ≤ n
2: Compute Bi = ⟨b∗i , b∗i ⟩ = ‖b∗i ‖² for 1 ≤ i ≤ n
3: k = 2
4: while k ≤ n do
5:    for j = (k − 1) downto 1 do                ▷ Perform size reduction
6:        Let qj = ⌈µk,j ⌋ and set bk = bk − qj bj
7:        Update the values µk,j for 1 ≤ j < k
8:    end for
9:    if Bk ≥ (δ − µ²k,k−1 )Bk−1 then            ▷ Check Lovász condition
10:       k = k + 1
11:   else
12:       Swap bk with bk−1
13:       Update the values b∗k , b∗k−1 , Bk , Bk−1 , µk−1,j and µk,j for 1 ≤ j < k, and
          µi,k , µi,k−1 for k < i ≤ n
14:       k = max{2, k − 1}
15:   end if
16: end while
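The following Python sketch is a deliberately simple (and inefficient) rendering of Algorithm 24: instead of updating the Gram–Schmidt data as in lines 7 and 13, it recomputes everything from scratch whenever the basis changes. Exact rational arithmetic throughout; the function names are ours.

    from fractions import Fraction

    def gso(basis):
        # Full Gram-Schmidt data for the current basis, exactly over Q.
        n = len(basis)
        ortho, mu = [], [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            v = [Fraction(x) for x in basis[i]]
            for j in range(i):
                mu[i][j] = (sum(Fraction(x) * y
                                for x, y in zip(basis[i], ortho[j]))
                            / sum(y * y for y in ortho[j]))
                v = [vi - mu[i][j] * yj for vi, yj in zip(v, ortho[j])]
            ortho.append(v)
        return ortho, mu

    def lll(basis, delta=Fraction(3, 4)):
        basis = [list(b) for b in basis]
        n = len(basis)
        k = 1  # 0-based index; corresponds to k = 2 in Algorithm 24
        while k < n:
            for j in range(k - 1, -1, -1):  # size reduction
                q = round(gso(basis)[1][k][j])
                if q != 0:
                    basis[k] = [x - q * y for x, y in zip(basis[k], basis[j])]
            ortho, mu = gso(basis)
            Bk = sum(y * y for y in ortho[k])
            Bk1 = sum(y * y for y in ortho[k - 1])
            if Bk >= (delta - mu[k][k - 1] ** 2) * Bk1:  # Lovasz condition
                k += 1
            else:
                basis[k - 1], basis[k] = basis[k], basis[k - 1]  # swap
                k = max(1, k - 1)
        return basis

    print(lll([(1, 0, 0), (4, 2, 15), (0, 0, 3)]))
    # [[1, 0, 0], [0, 2, 0], [0, 0, 3]], matching Example 17.4.8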

Lemma 17.4.1 Throughout the LLL algorithm the values b∗i and Bi for 1 ≤ i ≤ n and µi,j
for 1 ≤ j < i ≤ n are all correct Gram–Schmidt values.

Exercise 17.4.2 Prove Lemma 17.4.1. In other words, show that line 6 of the LLL algorithm
does not change b∗i or Bi for 1 ≤ i ≤ n. Similarly, line 12 of the algorithm does not change
any values except those mentioned in line 13.

It is illuminating to compare the LLL algorithm with the Lagrange–Gauss reduction


algorithm. The basic concept of size reduction followed by a swap is the same; there are,
however, two crucial differences.

1. The size reduction operation in the Lagrange–Gauss algorithm gives the minimal value
for ‖b2 + qb1 ‖ over q ∈ Z. In LLL, the coefficient µk,j is chosen to depend on bk and b∗j
so it does not necessarily minimise ‖bk ‖. Indeed, ‖bk ‖ can grow during the algorithm.
Of course, in the two-dimensional case of LLL then µ2,1 is the same as the value used
in the Lagrange–Gauss algorithm and so the size reduction step is the same.
2. The size check in LLL (the Lovász condition) is on the lengths of the Gram–Schmidt
vectors, unlike the size check in the Lagrange–Gauss algorithm, which is on the length
of the basis vectors themselves.

These features of LLL may seem counterintuitive, but they are essential to the proof that
the algorithm runs in polynomial-time.

Lemma 17.4.3 If bk and bk−1 are swapped then the Gram–Schmidt vectors b∗i for 1 ≤ i ≤ n
are changed as follows:

1. For 1 ≤ i < k − 1 and k < i ≤ n, b∗i is unchanged.
2. The new value for b∗k−1 is b∗k + µk,k−1 b∗k−1 and the new value for Bk−1 is B′k−1 = Bk +
µ²k,k−1 Bk−1 .
3. The new value for b∗k is (Bk /B′k−1 )b∗k−1 − (µk,k−1 Bk−1 /B′k−1 )b∗k and the new value for
Bk is B′k = Bk−1 Bk /B′k−1 .

Proof Denote by b′i the new basis (i.e., b′k−1 = bk and b′k = bk−1 ), b′∗i and µ′i,j the new
Gram–Schmidt values and B′i the squares of the lengths of the b′∗i . Clearly, b′∗i = b∗i for
1 ≤ i < k − 1 and µ′i,j = µi,j for 1 ≤ j < i < k − 1. Now

    b′∗k−1 = b′k−1 − ∑_{j=1}^{k−2} µ′k−1,j b∗j = bk − ∑_{j=1}^{k−2} µk,j b∗j = b∗k + µk,k−1 b∗k−1 .

Hence, B′k−1 = Bk + µ²k,k−1 Bk−1 .
Similarly

    b′∗k = b′k − ∑_{j=1}^{k−1} µ′k,j b′∗j
         = bk−1 − ∑_{j=1}^{k−2} µk−1,j b∗j − (⟨bk−1 , b′∗k−1 ⟩/B′k−1 )b′∗k−1
         = b∗k−1 − (⟨bk−1 , b∗k + µk,k−1 b∗k−1 ⟩/B′k−1 )(b∗k + µk,k−1 b∗k−1 )
         = b∗k−1 − (µk,k−1 Bk−1 /B′k−1 )(b∗k + µk,k−1 b∗k−1 )
         = (1 − µ²k,k−1 Bk−1 /B′k−1 )b∗k−1 − (µk,k−1 Bk−1 /B′k−1 )b∗k .

The result for b′∗k follows since 1 − µ²k,k−1 Bk−1 /B′k−1 = Bk /B′k−1 . Finally

    B′k = Bk² ⟨b∗k−1 , b∗k−1 ⟩/(B′k−1 )² + µ²k,k−1 Bk−1² ⟨b∗k , b∗k ⟩/(B′k−1 )² = Bk−1 Bk /B′k−1 . □

Exercise 17.4.4 Give explicit formulae for updating the other Gram–Schmidt values in
lines 7 and 13 of Algorithm 24.
Exercise 17.4.5 Show that the condition in line 9 of Algorithm 24 can be checked imme-
diately after µk,k−1 has been computed. Hence, show that the cases 1 ≤ j < k − 1 in the
loop in lines 5 to 8 of Algorithm 24 can be postponed to line 10.
Lemma 17.4.6 If the LLL algorithm terminates then the output basis is LLL-reduced.

Exercise 17.4.7 Prove Lemma 17.4.6. Indeed, the fact that the Lovász conditions are
satisfied is immediate. Prove the bounds on the µi,j using the three following steps. Let
1 ≤ j < k and let b′k = bk − ⌈µk,j ⌋bj .

1. Prove that ⟨bj , b∗j ⟩ = ⟨b∗j , b∗j ⟩ and ⟨bj , b∗i ⟩ = 0 if j < i.
2. Hence, writing µ′k,j = ⟨b′k , b∗j ⟩/⟨b∗j , b∗j ⟩, prove that |µ′k,j | ≤ 1/2 for 1 ≤ j < k.
3. For j < i < k denote µ′k,i = ⟨b′k , b∗i ⟩/⟨b∗i , b∗i ⟩. Prove that µ′k,i = µk,i .

In the next section we show that the LLL algorithm does terminate. Before then we give
an example and some further discussion.

Example 17.4.8 Let L be the lattice with basis matrix

    B = [1 0 0; 4 2 15; 0 0 3].

We will perform the LLL algorithm to reduce this lattice basis.
We will perform the LLL algorithm to reduce this lattice basis.
We start with k = 2 and compute µ2,1 = 4/1 = 4. So q1 = 4 and we define

b2 = b2 − 4b1 = (4, 2, 15) − (4, 0, 0) = (0, 2, 15).

We now want to check the second LLL condition. To do this we need b∗2 . We compute
µ2,1 = 0 and hence b∗2 = b2 . Then B1 = 1 and B2 = ⟨b∗2 , b∗2 ⟩ = 229. Clearly, B2 > (3/4 −
µ²2,1 )B1 and so we set k = 3. Now consider b3 . We compute µ3,2 = 45/229 ≈ 0.19 and,
since q2 = 0, there is no reduction to be performed on b3 . We compute µ3,1 = 0, so again
no size reduction is required. We now compute

    b∗3 = b3 − (45/229)b∗2 = (0, −90/229, 12/229).

We have B2 = 229 and B3 = ⟨b∗3 , b∗3 ⟩ = 8244/52441 ≈ 0.157. From this, one can check
that B3 < (3/4 − µ²3,2 )B2 ≈ 166.1. Hence, we swap b2 and b3 and set k = 2.
At this point we have the vectors

b1 = (1, 0, 0) and b2 = (0, 0, 3)

and b∗1 = b1 , b∗2 = b2 . First, check that µ2,1 = 0 and so no size reduction on b2 is required.
Second, B1 = 1 and B2 = 9 and one checks that B2 > (3/4 − µ22,1 )B1 = 0.75. Hence, we
set k = 3. Now

b3 = (0, 2, 15)

and we compute µ3,2 = 45/9 = 5. Hence, we reduce

b3 = b3 − 5b2 = (0, 2, 0).

Now compute µ3,1 = 0, so no reduction is required.
One computes µ3,2 = 0, b∗3 = b3 and B3 = 4. Hence, B3 < (3/4 − µ²3,2 )B2 = 27/4 =
6.75 and so we should swap b2 and b3 and set k = 2. One can check that the k = 2 phase runs
without making any changes. We have B1 = 1 and B2 = 4. Consider now k = 3 again. We
have µ3,2 = µ3,1 = 0 and so b3 remains unchanged. Finally, B3 = 9 > (3/4 − µ²3,2 )B2 = 3
and so we set k = 4 and halt.

Exercise 17.4.9 Perform the LLL algorithm by hand on the basis

{(−1, 5, 0), (2, 5, 0), (8, 6, 16)} .

Exercise 17.4.10 Perform the LLL algorithm by hand on the basis

{(0, 3, 4), (−1, 3, 3), (5, 4, −7)} .

Remark 17.4.11 Part 1 of Theorem 17.2.12 shows we have ‖b1 ‖ ≤ 2^{(n−1)/2} λ1 . In other
words, the LLL algorithm solves SVP up to an exponential factor, but is not guaranteed to
output a shortest vector in the lattice. Hence, LLL does not officially solve SVP.
In practice, at least for relatively small dimensions, the vector b1 output by the LLL
algorithm is often much closer to the shortest vector than this bound would suggest, and in
many cases will be a shortest vector in the lattice. In Example 17.4.8, the theoretical bound
gives ‖b1 ‖ ≤ 2, so (0, 2, 0) would have been a possible value for b1 .
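For a lattice as small as that of Example 17.4.8 one can verify λ1 by brute-force enumeration and compare it with the LLL output; the coefficient bound below is an arbitrary but sufficient choice, since the search finds a vector of length 1 and no non-zero integer vector can be shorter.

    import itertools
    import math

    basis = [(1, 0, 0), (4, 2, 15), (0, 0, 3)]
    best = None
    for coeffs in itertools.product(range(-20, 21), repeat=3):
        if coeffs == (0, 0, 0):
            continue
        v = [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if best is None or norm < best[0]:
            best = (norm, v)
    print(best)  # (1.0, [1, 0, 0]): lambda_1 = 1 is attained by the LLL vector b_1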

17.5 Complexity of LLL


We now show that the LLL algorithm terminates and runs in polynomial-time. The original
paper of Lenstra, Lenstra and Lovász [335] proves polynomial termination for any lattice
in Rm but only gives a precise complexity for lattices in Zm .

Theorem 17.5.1 Let L be a lattice in Zm with basis b1 , . . . , bn and let X ∈ Z≥2 be such
that ‖bi ‖² ≤ X for 1 ≤ i ≤ n. Let 1/4 < δ < 1. Then the LLL algorithm with factor δ
terminates and performs O(n² log(X)) iterations.

Proof We need to bound the number of "backtracks" in Algorithm 24. This number is
at most n plus the number of swaps. So it suffices to bound the number of swaps by
O(n² log(X)).
For 1 ≤ i ≤ n − 1 define the i × m matrix B(i) formed by the first i basis vectors for the
lattice. Define di = det(B(i) B(i)^T ) ∈ Z, which is the square of the volume of the sublattice
generated by the rows of B(i) . Hence

    di = ∏_{j=1}^{i} Bj = ∏_{j=1}^{i} ‖b∗j ‖² ≤ ∏_{j=1}^{i} ‖bj ‖² ≤ X^i .

Define

    D = ∏_{i=1}^{n−1} di = ∏_{i=1}^{n−1} Bi^{n−i} .

It follows that D ≤ X^{(n−1)n/2} .

Two vectors bk and bk−1 are swapped when Bk < (δ − µ²k,k−1 )Bk−1 . By Lemma 17.4.3,
the new values for Bk−1 and Bk are B′k−1 = Bk + µ²k,k−1 Bk−1 and B′k = Bk−1 Bk /B′k−1 . Let
d′i be the new values for the di . We have d′i = di when 1 ≤ i < k − 1. By the Lovász
condition B′k−1 < δBk−1 . Hence, d′k−1 < δdk−1 . Finally, since B′k−1 B′k = Bk−1 Bk we have
d′i = di for k ≤ i ≤ n. Hence, swapping bk and bk−1 always strictly reduces D.
On the other hand, we always have⁴ di ∈ Z and so D ≥ 1. It follows that the number
of swaps in the LLL algorithm is at most⁵ log_{1/δ} (X^{(n−1)n/2} ) = O(n² log(X)). Hence, the
algorithm requires O(n² log(X)) iterations of the main loop. □
Algorithm 24 and Theorem 17.5.1 provide a proof that an LLL-reduced basis exists for
every lattice.
Exercise 17.5.2 Let n ∈ N. Show that Hermite's constant satisfies γn ≤ 2^{(n−1)/4} (this bound
can be improved to (4/3)^{(n−1)/2} ; see [335]).
It is clear that if L ⊂ Zm then LLL can be implemented using exact Q arithmetic, and
hence exact integer arithmetic. But we need to show that the size of the integers does
not explode. The analysis given already for the Gram–Schmidt algorithm (for example,
Lemma 17.3.2) provides most of the tools we need.
Theorem 17.5.3 Let L be a lattice in Zm with basis b1 , . . . , bn and let X ∈ Z≥2 be such
that ‖bi ‖² ≤ X for 1 ≤ i ≤ n. Then the LLL algorithm requires arithmetic operations on
integers of size O(n log(X)).
Proof (Sketch) The bounds on the sizes of the b∗i follow the same methods as used in the
proof of Theorem 17.3.4. Since ‖b∗i ‖ is never increased during the algorithm (indeed, the
vectors are specifically permuted to reduce the ‖b∗i ‖) we have ‖b∗i ‖ ≤ X^{1/2} at the end of
each iteration. Since di−1 b∗i ∈ Zm and |di−1 | ≤ X^{i−1} it follows that the entries of b∗i can be
written as n∗i,j /di−1 where |n∗i,j | ≤ X^i .
Let us now consider the values ‖bi ‖² at the end of each iteration. These values all start
bounded by X. As the algorithm proceeds, the values are either not yet changed (and hence
still bounded by X) or have been modified so that the Gram–Schmidt basis is size reduced
(and possibly swapped to an earlier position in the list of vectors). After each size reduction
step (and before swapping) we have

    bi = b∗i + ∑_{j=1}^{i−1} µi,j b∗j

with −1/2 ≤ µi,j ≤ 1/2 and so

    ‖bi ‖² = ‖b∗i ‖² + ∑_{j=1}^{i−1} µ²i,j ‖b∗j ‖² ≤ nX.        (17.2)

⁴ To apply this argument it is necessary to use the square of the determinant. An integer lattice that does not have full rank does
not necessarily have integer determinant.
⁵ Recall that 1/4 < δ < 1 is considered as a fixed constant.

Hence, we have ‖bi ‖² ≤ nX at the end of each iteration and so the entries of bi are all
integers bounded by √(nX).
The remaining detail is to bound the sizes of the µi,j and the sizes of intermediate values
in line 6 of Algorithm 24. We refer to the proof of Proposition 1.26 of [335] for the bounds
|µi,j | ≤ 2^{n−i} (nX^{n−1} )^{1/2} and for further details. □
Corollary 17.5.4 Let L be a lattice in Zm with basis b1 , . . . , bn and let X ∈ Z≥2 be such
that ‖bi ‖² ≤ X for 1 ≤ i ≤ n. Then the LLL algorithm requires O(n³ m log(X)) arithmetic
operations on integers of size O(n log(X)). Using naive arithmetic gives running time
O(n⁵ m log(X)³) bit operations.

Proof Theorem 17.5.1 implies that the algorithm requires O(n² log(X)) iterations of the
main loop. Within each iteration there are n operations on vectors of length m. Hence,
O(n³ m log(X)) arithmetic operations. Theorem 17.5.3 implies that all arithmetic operations
are on integers of size O(n log(X)). □
Remark 17.5.5
1. Since the input size is $O(nm\log(X))$ and $n \le m$ the running time is cubic in the input size.
2. Note that the bounds on the sizes of integers involved in the LLL algorithm are $O(n\log(X))$ bits for the values $\mu_{i,j}$ and entries of $b_i^*$, while only $O(\log(n) + \log(X))$ for the entries in the vectors $b_i$. This is not just an artifact of the proof, but is a genuine phenomenon; it can already be seen in Example 17.4.8 where the $b_i$ all have very small integer coordinates and yet $\mu_{2,1} = 45/229$.
This leads to the idea of representing the $\mu_{i,j}$ and $b_i^*$ using approximate (floating-point) arithmetic and keeping exact integer arithmetic only for the $b_i$. Variants of LLL using floating-point arithmetic for the Gram–Schmidt vectors are much faster than the basic LLL algorithm presented in this chapter. Indeed, the basic algorithm is almost never used in practice.
A problem with using floating-point approximations is that comparisons now become inexact, and this leads to problems with both termination and correctness of the output. Implementing and analysing floating-point LLL algorithms is beyond the scope of this book. We refer to Stehlé [523] and Schnorr [472] for surveys of this topic.
3. One can show (e.g., using equation (17.2)) that the complexity statement holds also for $X = \max\{\|b_i^*\|^2 : 1 \le i \le n\}$, which could be smaller than $\max\{\|b_i\|^2 : 1 \le i \le n\}$.
4. Sometimes one is interested in reducing lattice bases that are in $\mathbb{Q}^m$ and not $\mathbb{Z}^m$. Suppose all rational numbers in the basis $B$ have numerator and denominator bounded by $X$. One can obtain an integer matrix by multiplying $B$ by an integer that clears all denominators, but the resulting integers could be as big as $X^{mn}$. This gives a worst-case complexity of $O(n^8 m^4 \log(X)^3)$ bit operations for lattice basis reduction.
Some applications such as simultaneous diophantine approximation (see Section 19.5) and the hidden number problem (see Section 21.7.1) have at most $m$ non-integer entries, giving a complexity of $O(n^5 m^4 \log(X)^3)$ bit operations.
17.6 Variants of the LLL algorithm

There are many refinements of the LLL algorithm that are beyond the scope of the brief summary in this book. We list some of these now.

• As mentioned earlier, it is necessary to use floating-point arithmetic to obtain a fast version of the LLL algorithm. A variant of floating-point LLL whose running time grows quadratically in $\log(X)$ (rather than cubically, as usual) is the $L^2$ algorithm of Nguyen and Stehlé [407] (also see Stehlé [523]).
• Schnorr–Euchner "deep insertions". The idea is that, rather than just swapping $b_k$ and $b_{k-1}$ in the LLL algorithm, one can move $b_k$ much earlier in the list of vectors if $B_k$ is sufficiently small. With standard LLL we have shown that swapping $b_k$ and $b_{k-1}$ changes $B_k$ to $B_k + \mu_{k,k-1}^2 B_{k-1}$. A similar argument shows that inserting $b_k$ between $b_{i-1}$ and $b_i$ for some $1 < i < k$ changes $B_k$ to
$$B' = B_k + \sum_{j=i}^{k-1} \mu_{k,j}^2 B_j.$$
Hence, one can let $i$ be the smallest index such that $B' < \tfrac{3}{4}B_i$ and insert $b_k$ between $b_{i-1}$ and $b_i$ (i.e., reorder the vectors $b_i, \ldots, b_k$ as $b_k, b_i, \ldots, b_{k-1}$). We refer to Schnorr and Euchner [473] and Section 2.6.2 of Cohen [127] for more details.
• Our presentation of the LLL algorithm was for the Euclidean norm. The algorithm has been extended to work with any norm by Lovász and Scarf [357] (also see Kaib and Ritter [291]). In practice, if one wants results for a norm other than the Euclidean norm, one usually performs ordinary LLL reduction with respect to the Euclidean norm and then uses the standard relations between norms (Lemma A.10.2) to determine the quality of the resulting vectors.
• Another important approach to lattice basis reduction is the block Korkine–Zolotarev algorithm due to Schnorr [468]. We mention this further in Section 18.5.


18
Algorithms for the closest and shortest vector problems

This chapter presents several algorithms to find lattice vectors close to a given vector.
First we consider two methods due to Babai that, although not guaranteed to solve the
closest vector problem, are useful in several situations in the book. Then we present an
exponential-time algorithm to enumerate all vectors close to a given point. This algorithm
can be used to solve the closest and shortest vector problems. We then briefly mention a
lattice basis reduction algorithm that is guaranteed to yield better approximate solutions to
the shortest vector problem.
The closest vector problem (CVP) was defined in Section 16.3. First, we remark that the shortest distance from a given vector $w \in \mathbb{R}^n$ to a lattice vector $v \in L$ can be quite large compared with the lengths of short vectors in the lattice.

Example 18.0.1 Consider the lattice in $\mathbb{R}^2$ with basis $(1, 0)$ and $(0, 1000)$. Then $w = (0, 500)$ has distance 500 from the closest lattice point, despite the fact that the first successive minimum is 1.

Exercise 18.0.2 Let $L = \mathbb{Z}^n$ and $w = (1/2, \ldots, 1/2)$. Show that $\|w - v\| \ge \sqrt{n}/2$ for all $v \in L$. Hence, show that if $n > 4$ then $\|w - v\| > \lambda_n$ for all $v \in L$.

18.1 Babai’s nearest plane method


Let $L$ be a full rank lattice given by an (ordered) basis $\{b_1, \ldots, b_n\}$ and let $\{b_1^*, \ldots, b_n^*\}$ be the corresponding Gram–Schmidt basis. Let $w \in \mathbb{R}^n$. Babai [17] presented a method to inductively find a lattice vector close to $w$. The vector $v \in L$ output by Babai's method is not guaranteed to be such that $\|w - v\|$ is minimal. Theorem 18.1.6 shows that if the lattice basis is LLL-reduced then $\|w - v\|$ is within an exponential factor of the minimal value.
We now describe the method. Define $U = \mathrm{span}\{b_1, \ldots, b_{n-1}\}$ and let $L' = L \cap U$ be the sublattice spanned by $\{b_1, \ldots, b_{n-1}\}$. The idea of the nearest plane method is to find a vector $y \in L$ such that the distance from $w$ to the plane $U + y$ is minimal. One then sets $w'$ to be the orthogonal projection of $w$ onto the plane $U + y$ (in other words, $w' \in U + y$ and $w - w' \in U^{\perp}$). Let $w'' = w' - y \in U$. Note that if $w \in L$ then $w'' \in L'$. One inductively solves the (lower dimensional) closest vector problem of $w''$ in $L'$ to get $y' \in L'$. The solution to the original instance of the CVP is $v = y + y'$.

366
[Figure 18.1: Illustration of the Babai nearest plane method, showing $w$, its projection $w'$ onto the plane $U + y$, and the vectors $b_n$ and $b_n^*$. The x-axis represents the subspace $U$ (which has dimension $n - 1$) and the y-axis is perpendicular to $U$.]

We now explain how to algebraically find $y$ and $w'$.

Lemma 18.1.1 Let
$$w = \sum_{j=1}^{n} l_j b_j^* \qquad (18.1)$$
with $l_j \in \mathbb{R}$. Define $y = \lceil l_n \rfloor b_n \in L$ and $w' = \sum_{j=1}^{n-1} l_j b_j^* + \lceil l_n \rfloor b_n^*$. Then $y$ is such that the distance between $w$ and $U + y$ is minimal, and $w'$ is the orthogonal projection of $w$ onto $U + y$.

Proof We use the fact that $U = \mathrm{span}\{b_1^*, \ldots, b_{n-1}^*\}$. The distance from $w$ to $U + y$ is
$$\inf_{u \in U} \|w - (u + y)\|.$$
Let $w$ be as in equation (18.1) and let $y = \sum_{j=1}^{n} l'_j b_j$ be any element of $L$ for $l'_j \in \mathbb{Z}$. One can write $y = \sum_{j=1}^{n-1} l''_j b_j^* + l'_n b_n^*$ for some $l''_j \in \mathbb{R}$, $1 \le j \le n-1$.
Lemma A.10.5 shows that, for fixed $w$ and $y$, $\|w - (u + y)\|^2$ is minimised by $u = \sum_{j=1}^{n-1} (l_j - l''_j) b_j^* \in U$. Indeed
$$\|w - (u + y)\|^2 = (l_n - l'_n)^2 \|b_n^*\|^2.$$
It follows that one must take $l'_n = \lceil l_n \rfloor$, and so the choice of $y$ in the statement of the Lemma is correct (note that one can add any element of $L'$ to $y$ and it is still a valid choice).
The vector $w'$ satisfies
$$w' - y = \sum_{j=1}^{n-1} l_j b_j^* + \lceil l_n \rfloor (b_n^* - b_n) \in U$$
which shows that $w' \in U + y$. Also
$$w - w' = \sum_{j=1}^{n} l_j b_j^* - \sum_{j=1}^{n-1} l_j b_j^* - \lceil l_n \rfloor b_n^* = (l_n - \lceil l_n \rfloor) b_n^* \qquad (18.2)$$
which is orthogonal to $U$. Hence, $w'$ is the orthogonal projection of $w$ onto $U + y$. □

Exercise 18.1.2 Let the notation be as above and write $b_n = b_n^* + \sum_{i=1}^{n-1} \mu_{n,i} b_i^*$. Show that
$$w'' = \sum_{i=1}^{n-1} (l_i - \lceil l_n \rfloor \mu_{n,i}) b_i^*.$$

Exercise 18.1.3 Let $\{b_1, \ldots, b_n\}$ be an ordered basis for a lattice $L$. Let $w \in \mathbb{R}^n$ and suppose that there is an element $v \in L$ such that $\|v - w\| < \frac{1}{2}\|b_i^*\|$ for all $1 \le i \le n$. Prove that the nearest plane algorithm outputs $v$.

The following Lemma is needed to prove the main result, namely Theorem 18.1.6.

Lemma 18.1.4 Let $\{b_1, \ldots, b_n\}$ be LLL-reduced (with respect to the Euclidean norm, and with factor $\delta = 3/4$). If $v$ is the output of Babai's nearest plane algorithm on input $w$ then
$$\|w - v\|^2 \le \frac{2^n - 1}{4}\|b_n^*\|^2.$$

Proof We prove the result by induction. Certainly, if $n = 1$ then $\|w - v\|^2 \le \frac{1}{4}\|b_1^*\|^2$ as required.
Now suppose $n \ge 2$. Recall that the output of the method is $v = y + y'$ where $y \in L$ minimises the distance from $w$ to $U + y$, $w'$ is the orthogonal projection of $w$ onto $U + y$, and $y'$ is the output of the algorithm on $w'' = w' - y$ in $L'$. By the inductive hypothesis we know that $\|w'' - y'\|^2 \le \frac{1}{4}(2^{n-1} - 1)\|b_{n-1}^*\|^2$. Hence
$$\|w - (y + y')\|^2 = \|w - w' + w' - (y + y')\|^2 = \|w - w'\|^2 + \|w' - (y + y')\|^2 \le \tfrac{1}{4}\|b_n^*\|^2 + \tfrac{2^{n-1}-1}{4}\|b_{n-1}^*\|^2$$
using equation (18.2). Finally, part 1 of Lemma 17.2.8 implies that this is
$$\le \left(\tfrac{1}{4} + 2\,\tfrac{2^{n-1}-1}{4}\right)\|b_n^*\|^2$$
from which the result follows. □


Exercise 18.1.5 Prove that if $v$ is the output of the nearest plane algorithm on input $w$ then
$$\|v - w\|^2 \le \frac{1}{4}\sum_{i=1}^{n} \|b_i^*\|^2.$$

Theorem 18.1.6 If the basis $\{b_1, \ldots, b_n\}$ is LLL-reduced (with respect to the Euclidean norm and with factor $\delta = 3/4$) then the output of the Babai nearest plane algorithm on $w \in \mathbb{R}^n$ is a vector $v$ such that $\|v - w\| < 2^{n/2}\|u - w\|$ for all $u \in L$.

Proof We prove the result by induction. For $n = 1$, $v$ is a correct solution to the closest vector problem and so the result holds.
Let $n \ge 2$ and let $u \in L$ be a closest vector to $w$. Let $y$ be the vector chosen in the first step of the Babai method. We consider two cases.

1. Case $u \in U + y$. Then $\|u - w\|^2 = \|u - w'\|^2 + \|w' - w\|^2$ so $u$ is also a closest vector to $w'$. Hence, $u - y$ is a closest vector to $w'' = w' - y \in U$. Let $y'$ be the output of the Babai nearest plane algorithm on $w''$. By the inductive hypothesis
$$\|y' - w''\| < 2^{(n-1)/2}\|u - y - w''\|.$$
Substituting $w' - y$ for $w''$ gives
$$\|y + y' - w'\| < 2^{(n-1)/2}\|u - w'\|.$$
Now
$$\|v - w\|^2 = \|y + y' - w'\|^2 + \|w' - w\|^2 < 2^{n-1}\|u - w'\|^2 + \|w' - w\|^2.$$
Using $\|u - w'\|, \|w' - w\| \le \|u - w\|$ and $2^{n-1} + 1 \le 2^n$ gives the result.

2. Case $u \not\in U + y$. Since the distance from $w$ to $U + y$ is $\le \frac{1}{2}\|b_n^*\|$, we have $\|w - u\| \ge \frac{1}{2}\|b_n^*\|$. By Lemma 18.1.4 we find
$$\tfrac{1}{2}\|b_n^*\| \ge \tfrac{1}{2}\sqrt{\tfrac{4}{2^n - 1}}\,\|w - v\|.$$
Hence, $\|w - v\| < 2^{n/2}\|w - u\|$.

This completes the proof. □

One can obtain a better result by using the result of Lemma 17.2.9.

Theorem 18.1.7 If the basis $\{b_1, \ldots, b_n\}$ is LLL-reduced with respect to the Euclidean norm and with factor $\delta = 1/4 + 1/\sqrt{2}$ then the output of the Babai nearest plane algorithm on $w \in \mathbb{R}^n$ is a vector $v$ such that
$$\|v - w\| < \frac{2^{n/4}}{\sqrt{\sqrt{2} - 1}}\|u - w\| < (1.6)\,2^{n/4}\,\|u - w\|$$
for all $u \in L$.
Exercise 18.1.8 Prove Theorem 18.1.7.
[Hint: First prove that the analogue of Lemma 18.1.4 in this case is $\|w - v\|^2 \le \frac{2^{n/2} - 1}{4(\sqrt{2} - 1)}\|b_n^*\|^2$. Then follow the proof of Theorem 18.1.6 using the fact that $\left(2^{(n-1)/4}/\sqrt{\sqrt{2}-1}\right)^2 + 1 \le \left(2^{n/4}/\sqrt{\sqrt{2}-1}\right)^2$.]

Algorithm 25 is the Babai nearest plane algorithm. We use the notation $y_n = y$, $w_n = w$, $w_{n-1} = w'$ etc. Note that Babai's algorithm can be performed using exact arithmetic over $\mathbb{Q}$ or using floating-point arithmetic.

Algorithm 25 Babai nearest plane algorithm
Input: $\{b_1, \ldots, b_n\}$, $w$
Output: $v$
Compute the Gram–Schmidt basis $b_1^*, \ldots, b_n^*$
Set $w_n = w$
for $i = n$ downto 1 do
    Compute $l_i = \langle w_i, b_i^* \rangle / \langle b_i^*, b_i^* \rangle$
    Set $y_i = \lceil l_i \rfloor b_i$
    Set $w_{i-1} = w_i - (l_i - \lceil l_i \rfloor) b_i^* - \lceil l_i \rfloor b_i$
end for
return $v = y_1 + \cdots + y_n$
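The following is a minimal runnable version of Algorithm 25, using exact arithmetic over $\mathbb{Q}$ via Python's Fraction type (all helper names are ours). The rounding helper follows the convention $\lceil 1/2 \rfloor = 1$; when some $l_i$ is exactly half-integral the nearest plane is not unique, so a tie may be broken differently than in Example 18.1.11 below.

    # Sketch of Algorithm 25 (Babai nearest plane) in exact rational arithmetic.
    from fractions import Fraction
    from math import floor

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def gram_schmidt(basis):
        bstar = []
        for b in basis:
            v = [Fraction(x) for x in b]
            for bs in bstar:
                m = dot(b, bs) / dot(bs, bs)
                v = [vj - m * bsj for vj, bsj in zip(v, bs)]
            bstar.append(v)
        return bstar

    def closest_integer(x):
        # the book's convention: the closest integer to 1/2 is 1
        return floor(x + Fraction(1, 2))

    def nearest_plane(basis, w):
        bstar = gram_schmidt(basis)
        w = [Fraction(x) for x in w]
        v = [Fraction(0)] * len(w)
        for i in reversed(range(len(basis))):
            li = dot(w, bstar[i]) / dot(bstar[i], bstar[i])
            c = closest_integer(li)
            v = [vj + c * bj for vj, bj in zip(v, basis[i])]
            w = [wj - (li - c) * bsj - c * bj
                 for wj, bsj, bj in zip(w, bstar[i], basis[i])]
        return [int(x) for x in v]

    basis = [[1, 2, 3], [3, 0, -3], [3, -7, 3]]   # basis of Example 18.1.11 below
    print(nearest_plane(basis, [10, 6, 5]))
    # [10, 8, 6]: here l_1 = 7/2 is a tie and is rounded up; Example 18.1.11
    # rounds it down and returns (9, 6, 3), which is equally close.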

Exercise 18.1.9 Let $\{b_1, \ldots, b_n\}$ be a basis for a lattice in $\mathbb{Z}^n$. Let $X \in \mathbb{R}_{>0}$ be such that $\|b_i\| \le X$ for $1 \le i \le n$. Show that the complexity of the Babai nearest plane algorithm (not counting LLL) when using exact arithmetic over $\mathbb{Q}$ is $O(n^5\log(X)^2)$ bit operations.

Exercise 18.1.10 If $\{b_1, \ldots, b_n\}$ is an ordered LLL-reduced basis then $b_1$ is likely to be shorter than $b_n$. It would therefore be more natural to start with $b_1$ and define $U$ to be the orthogonal complement of $b_1$. Why is this not possible?

Example 18.1.11 Consider the LLL-reduced basis
$$B = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 0 & -3 \\ 3 & -7 & 3 \end{pmatrix}$$
and the vector $w = (10, 6, 5) \in \mathbb{R}^3$. We perform the nearest plane method to find a lattice vector close to $w$.
First compute the Gram–Schmidt basis $b_1^* = (1, 2, 3)$, $b_2^* = (24/7, 6/7, -12/7)$ and $b_3^* = (10/3, -20/3, 10/3)$. Write
$$w = \tfrac{37}{14} b_1^* + 2 b_2^* + \tfrac{3}{20} b_3^*.$$
Since $\lceil 3/20 \rfloor = 0$ we have $y_3 = 0$ and $w' = w'' = \tfrac{37}{14} b_1^* + 2 b_2^* = (19/2, 7, 9/2)$. The process is continued inductively, so write $w = w''$. Then one takes $y_2 = 2b_2 = (6, 0, -6)$ and $w'' = w - y_2 = \tfrac{7}{2} b_1^*$. Since $7/2$ is exactly halfway between 3 and 4 either rounding gives a nearest plane; taking $\lceil 7/2 \rfloor = 3$ we return the solution
$$3b_1 + 2b_2 = (9, 6, 3).$$

Exercise 18.1.12 Show that the vector $v$ output by the Babai nearest plane method lies in the parallelepiped
$$\left\{ w + \sum_{j=1}^{n} l_j b_j^* : l_j \in \mathbb{R}, |l_j| \le \tfrac{1}{2} \right\}$$
centered on $w$. Show that this parallelepiped has volume equal to the volume of the lattice. Hence, show that if $w$ does not lie in the lattice then there is exactly one lattice point in this parallelepiped.

Some improvements to the Babai nearest plane algorithm are listed in Section 3.4 of
[233]. Similar methods (but using a randomised choice of plane) were used by Klein [306]
to solve the CVP when the target vector is particularly close to a lattice point. Another
variant of the nearest plane algorithm is given by Lindner and Peikert [352]. The nearest
plane algorithm is known by the name “VBLAST” in the communications community
(see [396]).

18.2 Babai’s rounding technique


An alternative to the nearest plane method is the rounding technique. This is simpler to compute in practice, since it does not require the computation of a Gram–Schmidt basis, but harder to analyse in theory. This method is also not guaranteed to solve CVP. Let $b_1, \ldots, b_n$ be a basis for a full rank lattice in $\mathbb{R}^n$. Given a target $w \in \mathbb{R}^n$ we can write
$$w = \sum_{i=1}^{n} l_i b_i$$
with $l_i \in \mathbb{R}$. One computes the coefficients $l_i$ by solving the system of linear equations (since the lattice is full rank we can also compute the vector $(l_1, \ldots, l_n)$ as $wB^{-1}$). The rounding technique is simply to set
$$v = \sum_{i=1}^{n} \lceil l_i \rfloor b_i$$
where $\lceil l \rfloor$ is the closest integer to the real number $l$. This procedure can be performed using any basis for the lattice. Babai has proved that $\|v - w\|$ is within an exponential factor of the minimal value if the basis is LLL-reduced. The method trivially generalises to non-full-rank lattices as long as $w$ lies in the $\mathbb{R}$-span of the basis.
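A minimal sketch of the rounding technique, assuming SymPy for the exact linear solve (function names are ours); it reproduces the computation of Example 18.2.4 below.

    # Sketch of Babai rounding: solve w = l*B over Q, round each coefficient.
    from sympy import Matrix, Rational, floor

    def babai_round(B, w):
        l = Matrix([w]) * Matrix(B).inv()            # coefficient vector l = w B^{-1}
        c = [floor(x + Rational(1, 2)) for x in l]   # closest integers, <1/2> -> 1
        return list(Matrix([c]) * Matrix(B))

    B = [[1, 2, 3], [3, 0, -3], [3, -7, 3]]
    print(babai_round(B, [10, 6, 5]))                # [10, 8, 6], cf. Example 18.2.4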

Theorem 18.2.1 Let $b_1, \ldots, b_n$ be an LLL-reduced basis (with respect to the Euclidean norm and with factor $\delta = 3/4$) for a lattice $L \subseteq \mathbb{R}^n$. Then the output $v$ of the Babai rounding method on input $w \in \mathbb{R}^n$ satisfies
$$\|w - v\| \le (1 + 2n(9/2)^{n/2})\,\|w - u\|$$
for all $u \in L$.

[Figure 18.2: Parallelepiped centered at $w = (-0.4, 0.4)$ corresponding to the lattice basis $(3, 2)$ and $(2, 1)$.]
Proof See Babai [17]. □

Babai rounding gives a lattice point $v$ such that $w - v = \sum_{i=1}^{n} m_i b_i$ where $|m_i| \le 1/2$. In other words, $v$ lies in the parallelepiped, centered at $w$, defined by the basis vectors. Since the volume of the parallelepiped is equal to the volume of the lattice, if $w$ is not in the lattice then there is exactly one lattice point in the parallelepiped. The geometry of the parallelepiped determines whether or not an optimal solution to the CVP is found. Hence, though the rounding method can be used with any basis for a lattice, the result depends on the quality of the basis.
Example 18.2.2 Let $b_1 = (3, 2)$ and $b_2 = (2, 1)$ generate the lattice $\mathbb{Z}^2$. Let $w = (-0.4, 0.4)$ so that the solution to CVP is $(0, 0)$. One can verify that $(-0.4, 0.4) = 1.2b_1 - 2b_2$ and so Babai rounding yields $b_1 - 2b_2 = (-1, 0)$. Figure 18.2 shows the parallelepiped centered at $w$ corresponding to the basis. One can see that $(-1, 0)$ is the only lattice point within that parallelepiped.

Exercise 18.2.3 Consider the vector $w = (-0.4, 0.4)$ as in Example 18.2.2 again. Using the basis $\{(1, 0), (0, 1)\}$ for $\mathbb{Z}^2$ use the Babai rounding method to find the closest lattice vector in $\mathbb{Z}^2$ to $w$. Draw the parallelepiped centered on $w$ in this case.
We stress that the rounding method is not the same as the nearest plane method. The
next example shows that the two methods can give different results.
Example 18.2.4 Consider the CVP instance in Example 18.1.11. We have
$$w = \tfrac{141}{40} b_1 + \tfrac{241}{120} b_2 + \tfrac{3}{20} b_3.$$
Hence, one sets
$$v = 4b_1 + 2b_2 = (10, 8, 6).$$
Note that this is a different solution to the one found in Example 18.1.11, though both solutions satisfy $\|w - v\| = \sqrt{5}$.

Exercise 18.2.5 Prove that if b1 , . . . , bn are orthogonal basis vectors for a lattice L then
the Babai rounding technique produces a correct solution to the CVP with respect to the
Euclidean norm. Show also that the Babai rounding technique gives the same result as the
Babai nearest plane method in this case.

Exercise 18.2.6 Show that the nearest plane and rounding methods produce a linear
combination of the lattice basis where the vector bn has the same coefficient.

Exercise 18.2.7 Consider the lattice with basis
$$\begin{pmatrix} 7 & 0 & 1 \\ 1 & 17 & 1 \\ -3 & 0 & 10 \end{pmatrix}$$
and let
$$w = (100, 205, 305).$$
Find a lattice vector $v$ close to $w$ using the rounding technique. What is $\|v - w\|$?

The Babai rounding algorithm is known by the name “zero forcing” in the communica-
tions community (see [396]).

18.3 The embedding technique


Another way to solve CVP is the embedding technique, due to Kannan (see page 437 onwards of [296]). Let $B$ be a basis matrix for a lattice $L$ and suppose $w \in \mathbb{R}^n$ (in practice, we assume $w \in \mathbb{Q}^n$). A solution to the CVP corresponds to integers $l_1, \ldots, l_n$ such that
$$w \approx \sum_{i=1}^{n} l_i b_i.$$
The crucial observation is that $e = w - \sum_{i=1}^{n} l_i b_i$ is such that $\|e\|$ is small.
The idea of the embedding technique is to define a lattice $L'$ that contains the short vector $e$. Let $M \in \mathbb{R}_{>0}$ (for example, $M = 1$). The lattice $L'$ is defined by the vectors (which are a basis for $\mathbb{R}^{n+1}$)
$$(b_1, 0), \ldots, (b_n, 0), (w, M). \qquad (18.3)$$
One sees that taking the linear combination of rows with coefficients $(-l_1, \ldots, -l_n, 1)$ gives the vector
$$(e, M).$$
Hence, we might be able to find $e$ by solving the SVP problem in the lattice $L'$. One can then solve the CVP by subtracting $e$ from $w$.
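As a concrete sketch (using the data of Example 18.3.1 below, and assuming the fpylll library for LLL; any lattice reduction routine would do):

    # Sketch of the embedding technique with M = 1, assuming fpylll.
    from fpylll import IntegerMatrix, LLL

    B = [[35, 72, -100], [-10, 0, -25], [-20, -279, 678]]
    w, M = [100, 100, 100], 1

    rows = [b + [0] for b in B] + [w + [M]]     # the basis (18.3) for L'
    A = IntegerMatrix.from_matrix(rows)
    LLL.reduction(A)
    e = [A[0, j] for j in range(3)]             # short vector (e, +-M), up to sign
    v = [wi - ei for wi, ei in zip(w, e)]
    print(e, v)                                  # (0, 1, 0) and (100, 99, 100)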

Example 18.3.1 Consider the basis matrix
$$B = \begin{pmatrix} 35 & 72 & -100 \\ -10 & 0 & -25 \\ -20 & -279 & 678 \end{pmatrix}$$
for a lattice in $\mathbb{R}^3$. We solve the CVP instance with $w = (100, 100, 100)$.
Apply the LLL algorithm to the basis matrix (taking $M = 1$)
$$\begin{pmatrix} 35 & 72 & -100 & 0 \\ -10 & 0 & -25 & 0 \\ -20 & -279 & 678 & 0 \\ 100 & 100 & 100 & 1 \end{pmatrix}$$
for the lattice $L'$. This gives the basis matrix
$$\begin{pmatrix} 0 & 1 & 0 & 1 \\ 5 & 0 & 1 & 0 \\ 0 & 5 & 1 & -4 \\ 5 & 5 & -21 & -4 \end{pmatrix}.$$
The first row is $(0, 1, 0, 1)$, so we know that $(0, 1, 0)$ is the difference between $w$ and a lattice point $v$. One verifies that $v = (100, 100, 100) - (0, 1, 0) = (100, 99, 100)$ is a lattice point.

The success of the embedding technique depends on the size of $\|e\|$ compared with the lengths of short vectors in the original lattice $L$. As we have seen in Exercise 18.0.2, $\|e\|$ can be larger than $\lambda_n$, in which case the embedding technique is not likely to be a good way to solve the closest vector problem.

Lemma 18.3.2 Let $\{b_1, \ldots, b_n\}$ be a basis for a lattice $L \subseteq \mathbb{Z}^n$ and denote by $\lambda_1$ the shortest Euclidean length of a non-zero element of $L$. Let $w \in \mathbb{R}^n$ and let $v \in L$ be a closest lattice point to $w$. Define $e = w - v$. Suppose that $\|e\| < \lambda_1/2$ and let $M = \|e\|$. Then $(e, M)$ is a shortest non-zero vector in the lattice $L'$ of the embedding technique.

Proof All vectors in the lattice $L'$ are of the form
$$l_{n+1}(e, M) + \sum_{i=1}^{n} l_i (b_i, 0)$$
for some $l_1, \ldots, l_{n+1} \in \mathbb{Z}$. Every non-zero vector with $l_{n+1} = 0$ is of length at least $\lambda_1$. Since
$$\|(e, M)\|^2 = \|e\|^2 + M^2 = 2M^2 < 2\lambda_1^2/4$$
the vector $(e, \pm M)$ has length at most $\lambda_1/\sqrt{2}$. Since $v$ is a closest vector to $w$ it follows that $\|e\| \le \|e + x\|$ for all $x \in L$, and so every other vector $(u, \pm M) \in L'$ has length at least as large. Finally, suppose $|l_{n+1}| \ge 2$. Then
$$\|(u, l_{n+1}M)\|^2 \ge \|(0, l_{n+1}M)\|^2 \ge (2M)^2$$
and so $\|(u, l_{n+1}M)\| \ge \sqrt{2}\,\|(e, M)\|$. □

Lemma 18.3.2 shows that the CVP can be reduced to SVP as long as the target vector
is very close to a lattice vector, and assuming one has a good guess M for the distance.
However, when using algorithms such as LLL that solve the approximate SVP it is not
possible, in general, to make rigorous statements about the success of the embedding
technique. As mentioned earlier, the LLL algorithm often works better than the theoretical
analysis predicts. Hence, the embedding technique can potentially be useful even when w
is not so close to a lattice point. For further discussion see Lemma 6.15 of Kannan [296].

Exercise 18.3.3 Let $\{b_1, \ldots, b_n\}$ be a basis for a lattice in $\mathbb{R}^n$ and let $w \in \mathbb{R}^n$. Let $M = \max_{1 \le i \le n} \|b_i\|$. Show that the output $(e, M)$ of the embedding technique (using LLL) on the basis of equation (18.3) is the same as the output of the Babai nearest plane algorithm when run on the LLL-reduced basis.

Exercise 18.3.4 Solve the following CVP instance using the embedding technique and a computer algebra package.
$$B = \begin{pmatrix} -265 & 287 & 56 \\ -460 & 448 & 72 \\ -50 & 49 & 8 \end{pmatrix}, \qquad w = (100, 80, 100).$$

18.4 Enumerating all short vectors


We present a method to enumerate all short vectors in a lattice, given any basis. We will show later that the performance of this enumeration algorithm depends on the quality of the lattice basis. Throughout this section, $\|v\|$ denotes the Euclidean norm.
The first enumeration method was given by Pohst in 1981. Further variants were given by Fincke and Pohst, Kannan [295, 296], Helfrich [255] and Schnorr and Euchner [473]. These methods are all deterministic and are guaranteed to output a non-zero vector of minimum length. The time complexity is exponential in the lattice dimension, but the storage requirements are polynomial. This approach is known by the name "sphere decoding" in the communications community (see [396]).
Exercise 18.4.1 Let $\{b_1, \ldots, b_n\}$ be an (ordered) basis in $\mathbb{R}^m$ for a lattice and let $\{b_1^*, \ldots, b_n^*\}$ be the Gram–Schmidt orthogonalisation. Let $v \in \mathbb{R}^m$. Show that the projection of $v$ onto $b_i^*$ is
$$\frac{\langle v, b_i^* \rangle}{\|b_i^*\|^2}\, b_i^*.$$
Show that if $v = \sum_{j=1}^{n} x_j b_j$ then this projection is
$$\left( x_i + \sum_{j=i+1}^{n} x_j \mu_{j,i} \right) b_i^*.$$

Lemma 18.4.2 Let $\{b_1, \ldots, b_n\}$ be an (ordered) basis for a lattice and let $\{b_1^*, \ldots, b_n^*\}$ be the Gram–Schmidt orthogonalisation. Fix $A \in \mathbb{R}_{>0}$ and write $B_i = \|b_i^*\|^2$. Let $v = \sum_{i=1}^{n} x_i b_i$ be such that $\|v\|^2 \le A$. For $1 \le i \le n$ define
$$z_i = x_i + \sum_{j=i+1}^{n} \mu_{j,i} x_j.$$
Then
$$\sum_{i=1}^{n} z_i^2 B_i \le A.$$

Proof Exercise 18.4.1 gives a formula $z_i b_i^*$ for the projection of $v$ onto each $b_i^*$. Since the vectors $b_i^*$ are orthogonal we have
$$\|v\|^2 = \sum_{i=1}^{n} \|z_i b_i^*\|^2 = \sum_{i=1}^{n} z_i^2 B_i.$$
The result follows. □


Theorem 18.4.3 Let the notation be as in Lemma 18.4.2. Then one has $x_n^2 \le A/\|b_n^*\|^2$ and, for $1 \le i < n$,
$$\left( x_i + \sum_{j=i+1}^{n} \mu_{j,i} x_j \right)^2 B_i \le A - \sum_{j=i+1}^{n} z_j^2 B_j.$$

Proof Note that $z_n = x_n$ and Lemma 18.4.2 implies $z_n^2 B_n \le A$, which proves the first statement. The second statement is also just a re-writing of Lemma 18.4.2. □
We now sketch the enumeration algorithm for finding all short lattice vectors $v = \sum_{i=1}^{n} x_i b_i$, which follows from the above results. First, without loss of generality we may assume that $x_n \ge 0$. By Theorem 18.4.3 we know $0 \le x_n \le \sqrt{A/B_n}$. For each candidate $x_n$ one knows that
$$(x_{n-1} + \mu_{n,n-1}x_n)^2 B_{n-1} \le A - x_n^2 B_n$$
and so
$$|x_{n-1} + \mu_{n,n-1}x_n| \le \sqrt{(A - x_n^2 B_n)/B_{n-1}}.$$
To phrase this as a bound on $x_{n-1}$ one uses the fact that, for any $a \in \mathbb{R}$, $b \in \mathbb{R}_{\ge 0}$, the solutions $x \in \mathbb{R}$ to $|x + a| \le b$ satisfy $-(b + a) \le x \le b - a$. Hence, writing $M_1 = \sqrt{(A - x_n^2 B_n)/B_{n-1}}$ one has
$$-(M_1 + \mu_{n,n-1}x_n) \le x_{n-1} \le M_1 - \mu_{n,n-1}x_n.$$

Exercise 18.4.4 Generalise the above discussion to show that for $1 \le i < n$ one has
$$-(M_1 + M_2) \le x_i \le M_1 - M_2$$
where
$$M_1 = \sqrt{\left(A - \sum_{j=i+1}^{n} z_j^2 B_j\right)/B_i}$$
and $M_2 = \sum_{j=i+1}^{n} \mu_{j,i} x_j$.

Exercise 18.4.5 Write pseudocode for the algorithm to enumerate all short vectors of a lattice.

The algorithm to find a non-zero vector of minimal length is then straightforward. Set $A$ to be $\|b_1\|^2$, enumerate all vectors $v$ with $\|v\|^2 \le A$ and, for each vector, compute the length. One is guaranteed to find a shortest vector in the lattice. Schnorr and Euchner [473] organised the search in a manner to minimise the running time.
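A compact (and deliberately unoptimised) version of this enumeration might look as follows: exact Fraction arithmetic throughout, with a slightly padded floating-point interval for the candidate $x_i$ and an exact re-check before recursing (all names are ours).

    # Sketch: enumerate all x in Z^n with ||sum_i x_i b_i||^2 <= A.
    from fractions import Fraction
    from math import ceil, floor, sqrt

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def gso(basis):
        bstar, mu = [], {}
        for i in range(len(basis)):
            v = [Fraction(x) for x in basis[i]]
            for j in range(i):
                mu[i, j] = dot(basis[i], bstar[j]) / dot(bstar[j], bstar[j])
                v = [vk - mu[i, j] * bk for vk, bk in zip(v, bstar[j])]
            bstar.append(v)
        return bstar, mu

    def short_vectors(basis, A):
        n = len(basis)
        bstar, mu = gso(basis)
        B = [dot(v, v) for v in bstar]           # B_i = ||b*_i||^2
        out = []

        def search(i, x, rem):                   # rem = A - sum_{j>i} z_j^2 B_j
            if i < 0:
                out.append(x[:])
                return
            M2 = sum(mu[j, i] * x[j] for j in range(i + 1, n))
            r = sqrt(float(rem / B[i])) + 1e-9   # |x_i + M2| <= r (Exercise 18.4.4)
            for xi in range(ceil(-r - float(M2)), floor(r - float(M2)) + 1):
                z = xi + M2
                if z * z * B[i] <= rem:          # exact re-check of the bound
                    x[i] = xi
                    search(i - 1, x, rem - z * z * B[i])
            x[i] = 0

        search(n - 1, [0] * n, Fraction(A))
        return out                               # coefficient vectors, including 0

    basis = [[1, 2, 3], [3, 0, -3], [3, -7, 3]]
    print(short_vectors(basis, 14))              # [[-1,0,0], [0,0,0], [1,0,0]]

For a shortest non-zero vector one filters out the zero vector and keeps a coefficient vector of minimal length.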
The running time of this algorithm depends on the quality of the basis in several ways. First, it is evidently important to have a good bound $A$ for the length of the shortest vector. Taking $A = \|b_1\|^2$ is only sensible if $b_1$ is already rather short; alternatively one may choose, say, $A = \frac{n}{2\pi e}\det(L)^{2/n}$ using the Gaussian heuristic (one can choose a small bound for $A$ and then, if the search fails, increase $A$ accordingly). Second, one sees that if $b_n^*$ is very short then the algorithm searches a huge range of values for $x_n$, and similarly if $b_{n-1}^*$ is very short etc. Hence, the algorithm performs best if the values $\|b_i^*\|$ decrease rather gently.
To solve SVP in practice using enumeration one first performs LLL and other pre-
computation to get a sufficiently nice basis. We refer to Kannan [295, 296], Schnorr and
Euchner [473] and Agrell, Eriksson, Vardy and Zeger [7] for details. The best complexity
statement in the literature is due to Hanrot and Stehlé.

Theorem 18.4.6 (Hanrot and Stehlé [249]) There exists a polynomial $p(x, y) \in \mathbb{R}[x, y]$ such that, for any $n$-dimensional lattice $L$ in $\mathbb{Z}^m$ with basis consisting of vectors with coefficients bounded by $B$, one can compute all the shortest non-zero vectors in $L$ in at most $p(\log(B), m)\,n^{n/(2e)+o(n)}$ bit operations.
Due to lack of space we refer to the original papers for further details about enumeration
algorithms. Pujol and Stehlé [455] give an analysis of issues related to floating-point
implementation.
In practice, the most efficient methods to compute the SVP are heuristic “pruning”
methods. These methods are still exponential in the lattice dimension, and are not guaranteed
to output the shortest vector. The extreme pruning algorithm of Gama, Nguyen and Regev
[227] is currently the most practical method.
A quite different approach, leading to non-deterministic algorithms (in other words, the
output is a non-zero vector in the lattice that, with high probability, has minimal length)
is due to Ajtai, Kumar and Sivakumar (see [322] for a survey). The running time and
storage requirements of the algorithm are both exponential in the lattice dimension. For
some experimental results we refer to Nguyen and Vidick [417]. Micciancio and Voulgaris
[394] have given an improved algorithm, still requiring exponential time and storage.

18.4.1 Enumeration of closest vectors


The above ideas can be adapted to list lattice points close to some $w \in \mathbb{R}^n$. Let $A \in \mathbb{R}_{>0}$ and suppose we seek all $v \in L$ such that $\|v - w\|^2 \le A$. Write $v = \sum_{i=1}^{n} x_i b_i = \sum_{i=1}^{n} z_i b_i^*$ as before and write
$$w = \sum_{i=1}^{n} y_i b_i^*.$$
Then $\|v - w\|^2 \le A$ is equivalent to
$$\sum_{i=1}^{n} (z_i - y_i)^2 \|b_i^*\|^2 \le A.$$
It follows that
$$y_n - \sqrt{A/B_n} \le x_n \le y_n + \sqrt{A/B_n}$$
and so on.

Lemma 18.4.7 Let the notation be as above and define
$$M_i = \sqrt{\left(A - \sum_{j=i+1}^{n} (z_j - y_j)^2 B_j\right)/B_i}$$
and $N_i = \sum_{j=i+1}^{n} \mu_{j,i} x_j$ for $1 \le i \le n$. If $v = \sum_{i=1}^{n} x_i b_i$ satisfies $\|v - w\|^2 \le A$ then, for $1 \le i \le n$,
$$y_i - M_i - N_i \le x_i \le y_i + M_i - N_i.$$

Exercise 18.4.8 Prove Lemma 18.4.7.
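The same recursion adapts directly; the following sketch (names ours, same conventions as the short-vector sketch above, including the `dot` and `gso` helpers) lists all coefficient vectors $x$ with $\|\sum_i x_i b_i - w\|^2 \le A$.

    # Sketch: closest-vector enumeration via Lemma 18.4.7.
    from fractions import Fraction
    from math import ceil, floor, sqrt

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def gso(basis):
        bstar, mu = [], {}
        for i in range(len(basis)):
            v = [Fraction(x) for x in basis[i]]
            for j in range(i):
                mu[i, j] = dot(basis[i], bstar[j]) / dot(bstar[j], bstar[j])
                v = [vk - mu[i, j] * bk for vk, bk in zip(v, bstar[j])]
            bstar.append(v)
        return bstar, mu

    def close_vectors(basis, w, A):
        n = len(basis)
        bstar, mu = gso(basis)
        B = [dot(v, v) for v in bstar]
        y = [dot(w, bstar[i]) / B[i] for i in range(n)]  # w = sum_i y_i b*_i
        out = []

        def search(i, x, rem):
            if i < 0:
                out.append(x[:])
                return
            Ni = sum(mu[j, i] * x[j] for j in range(i + 1, n))
            r = sqrt(float(rem / B[i])) + 1e-9           # |x_i + N_i - y_i| <= M_i
            c = float(y[i] - Ni)
            for xi in range(ceil(c - r), floor(c + r) + 1):
                z = xi + Ni - y[i]
                if z * z * B[i] <= rem:                  # exact re-check
                    x[i] = xi
                    search(i - 1, x, rem - z * z * B[i])
            x[i] = 0

        search(n - 1, [0] * n, Fraction(A))
        return out

    basis = [[1, 2, 3], [3, 0, -3], [3, -7, 3]]
    print(close_vectors(basis, [10, 6, 5], 5))
    # includes [3, 2, 0] and [4, 2, 0], i.e. the vectors (9, 6, 3) and (10, 8, 6)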


The paper by Agrell, Eriksson, Vardy and Zeger [7] gives an excellent survey and comparison of the various enumeration techniques. They conclude that the Schnorr–Euchner variant is much more efficient than the Pohst or Kannan versions.

18.5 Korkine–Zolotarev bases


We present a notion of reduced lattice basis that has better properties than an LLL-reduced
basis.
Definition 18.5.1 Let $L$ be a lattice of rank $n$ in $\mathbb{R}^m$. An ordered basis $\{b_1, \ldots, b_n\}$ for $L$ is Korkine–Zolotarev reduced$^1$ if
1. $b_1$ is a non-zero vector of minimal length in $L$;
2. $|\mu_{i,1}| < 1/2$ for $2 \le i \le n$;
3. the basis $\{b_2 - \mu_{2,1}b_1, \ldots, b_n - \mu_{n,1}b_1\}$ is Korkine–Zolotarev reduced (this is the orthogonal projection of the basis of $L$ onto the orthogonal complement of $b_1$)
where $b_i^*$ is the Gram–Schmidt orthogonalisation and $\mu_{i,j} = \langle b_i, b_j^* \rangle/\langle b_j^*, b_j^* \rangle$.
One problem is that there is no known polynomial-time algorithm to compute a Korkine–
Zolotarev basis.
Theorem 18.5.2 Let $\{b_1, \ldots, b_n\}$ be a Korkine–Zolotarev reduced basis of a lattice $L$. Then:
1. for $1 \le i \le n$
$$\frac{4}{i+3}\lambda_i^2 \le \|b_i\|^2 \le \frac{i+3}{4}\lambda_i^2;$$
2.
$$\prod_{i=1}^{n} \|b_i\|^2 \le \prod_{i=1}^{n} \frac{i+3}{4}\, \gamma_n^n \det(L)^2.$$

Proof See Theorem 2.1 and 2.3 of Lagarias, Lenstra and Schnorr [324]. □
As we have seen, for lattices of relatively small dimension it is practical to enumerate
all short vectors. Hence, one can compute a Korkine–Zolotarev basis for lattices of small
dimension. Schnorr has developed the block Korkine–Zolotarev lattice basis reduction
algorithm, which computes a Korkine–Zolotarev basis for small-dimensional projections
of the original lattice and combines this with the LLL algorithm. The output basis can
be proved to be of a better quality than an LLL-reduced basis. This is the most powerful
algorithm for finding short vectors in lattices of large dimension. Due to lack of space we
are unable to present this algorithm; we refer to Schnorr [468] for details.

$^1$ Some authors also call it Hermite–Korkine–Zolotarev (HKZ) reduced.
19
Coppersmith’s method and related applications

An important application of lattice basis reduction is finding small solutions to polynomial equations $F(x) \equiv 0 \pmod{M}$ of degree $d > 1$. The main purpose of this chapter is to present some results of Coppersmith [131] on this problem. We also discuss finding small roots of bivariate integer polynomials and some other applications of these ideas.
In general, finding solutions to modular equations is easy if we know the factorisation of the modulus, see Section 2.12. However, if the factorisation of the modulus $M$ is not known then finding solutions can be hard. For example, if we can find a solution to $x^2 \equiv 1 \pmod{M}$ that is not $x = \pm 1$ then we can split $M$. Hence, we do not expect efficient algorithms for finding all solutions to modular equations in general.
Suppose then that the polynomial equation has a "small" solution. It is not so clear that finding the roots is necessarily a hard problem. The example $x^2 \equiv 1 \pmod{M}$ no longer gives any intuition since the two non-trivial roots both have absolute value at least $\sqrt{M}$. As we will explain in this chapter, if $F(x) \equiv 0 \pmod{M}$ of degree $d$ has a solution $x_0$ such that $|x_0| < M^{1/d - \epsilon}$ for small $\epsilon > 0$ then it can be found in polynomial-time. This result has a number of important consequences.
General references for the contents of this chapter are Coppersmith [131, 132], May [370, 371], Nguyen and Stern [415] and Nguyen [409].

19.1 Coppersmith’s method for modular univariate polynomials


19.1.1 First steps to Coppersmith’s method
We sketch the basic idea of the method, which goes back to Håstad. Let $F(x) = x^d + a_{d-1}x^{d-1} + \cdots + a_1x + a_0$ be a monic polynomial of degree $d$ with integer coefficients. Suppose we know that there exist one or more integers $x_0$ such that $F(x_0) \equiv 0 \pmod{M}$ and that $|x_0| < M^{1/d}$. The problem is to find all such roots.
Since $|x_0^i| < M$ for all $0 \le i \le d$ then, if the coefficients of $F(x)$ are small enough, one might have $F(x_0) = 0$ over $\mathbb{Z}$. The problem of finding integer roots of integer polynomials is easy: we can find roots over $\mathbb{R}$ using numerical analysis (e.g., Newton's method) and then round the approximations of the roots to the nearest integer and check whether they are solutions of $F(x)$.

The problem is therefore to deal with polynomials $F(x)$ having a small solution but whose coefficients are not small. Coppersmith's idea (in the formulation of Howgrave-Graham [268]) is to build from $F(x)$ a polynomial $G(x)$ that still has the same solution $x_0$, but which has coefficients small enough that the above logic does apply.

Example 19.1.1 Let $M = 17 \cdot 19 = 323$ and let
$$F(x) = x^2 + 33x + 215.$$
We want to find the small solution to $F(x) \equiv 0 \pmod{M}$ (in this case, $x_0 = 3$ is a solution, but note that $F(3) \ne 0$ over $\mathbb{Z}$).
We seek a related polynomial with small coefficients. For this example
$$G(x) = 9F(x) - M(x + 6) = 9x^2 - 26x - 3$$
satisfies $G(3) = 0$. This root can be found using Newton's method over $\mathbb{R}$ (or even the quadratic formula).
We introduce some notation for the rest of this section. Let $M, X \in \mathbb{N}$ and let $F(x) = \sum_{i=0}^{d} a_i x^i \in \mathbb{Z}[x]$. Suppose $x_0 \in \mathbb{Z}$ is a solution to $F(x) \equiv 0 \pmod{M}$ such that $|x_0| < X$. We associate with the polynomial $F(x)$ the row vector
$$b_F = (a_0, a_1X, a_2X^2, \ldots, a_dX^d). \qquad (19.1)$$
Vice versa, any such row vector corresponds to a polynomial. Throughout this section we will interpret polynomials as row vectors, and row vectors as polynomials, in this way.
Theorem 19.1.2 (Howgrave-Graham [268]) Let $F(x), X, M, b_F$ be as above (i.e., there is some $x_0$ such that $|x_0| \le X$ and $F(x_0) \equiv 0 \pmod{M}$). If $\|b_F\| < M/\sqrt{d+1}$ then $F(x_0) = 0$.

Proof Recall the Cauchy–Schwarz inequality $(\sum_{i=1}^{n} x_iy_i)^2 \le (\sum_{i=1}^{n} x_i^2)(\sum_{i=1}^{n} y_i^2)$ for $x_i, y_i \in \mathbb{R}$. Taking $x_i \ge 0$ and $y_i = 1$ for $1 \le i \le n$ one has
$$\sum_{i=1}^{n} x_i \le \sqrt{n}\sqrt{\sum_{i=1}^{n} x_i^2}.$$
Now
$$|F(x_0)| = \left|\sum_{i=0}^{d} a_ix_0^i\right| \le \sum_{i=0}^{d} |a_i||x_0|^i \le \sum_{i=0}^{d} |a_i|X^i \le \sqrt{d+1}\,\|b_F\| < \sqrt{d+1}\,M/\sqrt{d+1} = M$$
where the third inequality is Cauchy–Schwarz, so $-M < F(x_0) < M$. But $F(x_0) \equiv 0 \pmod{M}$ and so $F(x_0) = 0$. □
Let $F(x) = \sum_{i=0}^{d} a_ix^i$ be a monic polynomial. We assume that $F(x)$ has at least one solution $x_0$ modulo $M$ such that $|x_0| < X$ for some specified integer $X$. If $F(x)$ is not
monic but $\gcd(a_d, M) = 1$ then one can multiply $F(x)$ by $a_d^{-1} \pmod{M}$ to make it monic. If $\gcd(a_d, M) > 1$ then one can split $M$ and reduce the problem to two (typically easier) problems. As explained above, to find $x_0$ it will be sufficient to find a polynomial $G(x)$ with the same root $x_0$ modulo $M$ but with sufficiently small coefficients.
To do this, consider the $d + 1$ polynomials $G_i(x) = Mx^i$ for $0 \le i < d$ and $F(x)$. They all have the solution $x = x_0$ modulo $M$. Define the lattice $L$ with basis corresponding to these polynomials (by associating with a polynomial the row vector in equation (19.1)). Therefore, the basis matrix for the lattice $L$ is
$$B = \begin{pmatrix} M & 0 & \cdots & 0 & 0 \\ 0 & MX & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & MX^{d-1} & 0 \\ a_0 & a_1X & \cdots & a_{d-1}X^{d-1} & X^d \end{pmatrix}. \qquad (19.2)$$
Every element of this lattice is a row vector that can be interpreted as a polynomial $F(x)$ (via equation (19.1)) such that $F(x_0) \equiv 0 \pmod{M}$.

Lemma 19.1.3 The dimension of the lattice $L$ defined in equation (19.2) above is $d + 1$ and the determinant is
$$\det(L) = M^dX^{d(d+1)/2}.$$

Exercise 19.1.4 Prove Lemma 19.1.3.

One now runs the LLL algorithm on this (row) lattice basis. Let $G(x)$ be the polynomial corresponding to the first vector $b_1$ of the LLL-reduced basis (since every row of $B$ has the form of equation (19.1) then so does $b_1$).

Theorem 19.1.5 Let the notation be as above and let $G(x)$ be the polynomial corresponding to the first vector in the LLL-reduced basis for $L$. Set $c_1(d) = 2^{-1/2}(d+1)^{-1/d}$. If $X < c_1(d)M^{2/(d(d+1))}$ then any root $x_0$ of $F(x)$ modulo $M$ such that $|x_0| \le X$ satisfies $G(x_0) = 0$ in $\mathbb{Z}$.

Proof Recall that $b_1$ satisfies
$$\|b_1\| \le 2^{(n-1)/4}\det(L)^{1/n} = 2^{d/4}M^{d/(d+1)}X^{d/2}.$$
For $b_1$ to satisfy the conditions of Howgrave-Graham's theorem (i.e., $\|b_1\| < M/\sqrt{d+1}$) it is sufficient that
$$2^{d/4}M^{d/(d+1)}X^{d/2} < M/\sqrt{d+1}.$$
This can be written as
$$\sqrt{d+1}\,2^{d/4}X^{d/2} < M^{1/(d+1)},$$
which is equivalent to the condition in the statement of the theorem. □


In other words, if $d = 2$ then it is sufficient that $X \approx M^{1/3}$ to find the small solution using the above method. If $d = 3$ then it is sufficient that $X \approx M^{1/6}$. This is the result of Håstad. Of course, LLL often works better than the worst-case bound, so small solutions $x_0$ may be found even when $x_0$ does not satisfy the condition of the Theorem.

Example 19.1.6 Let $M = 10001$ and consider the polynomial
$$F(x) = x^3 + 10x^2 + 5000x - 222.$$
One can check that $F(x)$ is irreducible, and that $F(x)$ has the small solution $x_0 = 4$ modulo $M$. Note that $|x_0| < M^{1/6}$ so one expects to be able to find $x_0$ using the above method.
Suppose $X = 10$ is the given bound on the size of $x_0$. Consider the basis matrix
$$B = \begin{pmatrix} M & 0 & 0 & 0 \\ 0 & MX & 0 & 0 \\ 0 & 0 & MX^2 & 0 \\ -222 & 5000X & 10X^2 & X^3 \end{pmatrix}.$$
Running LLL on this matrix gives a reduced basis, the first row of which is
$$(444, 10, -2000, -2000).$$
The polynomial corresponding to this vector is
$$G(x) = 444 + x - 20x^2 - 2x^3.$$
Running Newton's root-finding method on $G(x)$ gives the solution $x_0 = 4$.
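The example can be reproduced mechanically. The sketch below (assuming fpylll; variable names ours) builds the basis matrix above, LLL-reduces it, and finds the integer root of the resulting polynomial by trial over $|x| \le X$ instead of Newton's method.

    # Sketch of the basic method on Example 19.1.6, assuming fpylll.
    from fpylll import IntegerMatrix, LLL

    M, X = 10001, 10
    a = [-222, 5000, 10, 1]                       # F(x) = x^3 + 10x^2 + 5000x - 222
    rows = [[M * X**i if j == i else 0 for j in range(4)] for i in range(3)]
    rows.append([a[i] * X**i for i in range(4)])  # row vector (19.1) for F
    A = IntegerMatrix.from_matrix(rows)
    LLL.reduction(A)
    G = [A[0, j] // X**j for j in range(4)]       # undo the X^j scaling
    print(G)                                      # e.g. [444, 1, -20, -2], up to sign
    roots = [x for x in range(-X, X + 1)
             if sum(G[i] * x**i for i in range(4)) == 0]
    print(roots)                                  # [4]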

19.1.2 The full Coppersmith method


The method in the previous section allows one to find small roots of modular polynomials, but it can be improved further. Looking at the proof of Theorem 19.1.5 one sees that the requirement for success is essentially $\det(L) < M^n$ (more precisely, it is $2^{d/4}M^{d/(d+1)}X^{d/2} < M/\sqrt{d+1}$). There are two strategies to extend the utility of the method (i.e., to allow bigger values for $X$). The first is to increase the dimension $n$ by adding rows to $L$ that contribute less than $M$ to the determinant. The second is to increase the power of $M$ on the right hand side. One can increase the dimension without increasing the power of $M$ by using the so-called "x-shift" polynomials $xF(x), x^2F(x), \ldots, x^kF(x)$; Example 19.1.7 gives an example of this. One can increase the power of $M$ on the right hand side by using powers of $F(x)$ (since if $F(x_0) \equiv 0 \pmod{M}$ then $F(x_0)^k \equiv 0 \pmod{M^k}$).

Example 19.1.7 Consider the problem of Example 19.1.6. The lattice has dimension 4 and determinant $M^3X^6$. The condition for LLL to output a sufficiently small vector is
$$2^{3/4}\left(M^3X^6\right)^{1/4} \le \frac{M}{\sqrt{4}},$$
which, taking $M = 10001$, leads to $X \approx 2.07$. (Note that the method worked for a larger value of $x_0$; this is because the bound used on LLL only applies in the worst case.)
Consider instead the basis matrix that also includes rows corresponding to the polynomials $xF(x)$ and $x^2F(x)$
$$B = \begin{pmatrix} M & 0 & 0 & 0 & 0 & 0 \\ 0 & MX & 0 & 0 & 0 & 0 \\ 0 & 0 & MX^2 & 0 & 0 & 0 \\ -222 & 5000X & 10X^2 & X^3 & 0 & 0 \\ 0 & -222X & 5000X^2 & 10X^3 & X^4 & 0 \\ 0 & 0 & -222X^2 & 5000X^3 & 10X^4 & X^5 \end{pmatrix}.$$
The dimension is 6 and the determinant is $M^3X^{15}$. The condition for LLL to output a sufficiently small vector is
$$2^{5/4}\left(M^3X^{15}\right)^{1/6} \le \frac{M}{\sqrt{6}},$$
which leads to $X \approx 3.11$. This indicates that some benefit can be obtained by using x-shifts.

Exercise 19.1.8 Let $G(x)$ be a polynomial of degree $d$. Show that taking $d$ x-shifts $G(x), xG(x), \ldots, x^{d-1}G(x)$ gives a method that works for $X \approx M^{1/(2d-1)}$.

Exercise 19.1.8 shows that when $d = 3$ we have improved the result from $X \approx M^{1/6}$ to $X \approx M^{1/5}$. Coppersmith [131] exploits both x-shifts and powers of $F(x)$. We now present the method in full generality.

Theorem 19.1.9 (Coppersmith) Let $0 < \epsilon < \min\{0.18, 1/d\}$. Let $F(x)$ be a monic polynomial of degree $d$ with one or more small roots $x_0$ modulo $M$ such that $|x_0| < \frac{1}{2}M^{1/d - \epsilon}$. Then $x_0$ can be found in time bounded by a polynomial in $d$, $1/\epsilon$ and $\log(M)$.

Proof Let $h > 1$ be an integer that depends on $d$ and $\epsilon$ and will be determined in equation (19.3) below. Consider the lattice $L$ corresponding (via the construction of the previous section) to the polynomials $G_{i,j}(x) = M^{h-1-j}F(x)^jx^i$ for $0 \le i < d$, $0 \le j < h$. Note that $G_{i,j}(x_0) \equiv 0 \pmod{M^{h-1}}$. The dimension of $L$ is $dh$. One can represent $L$ by a lower triangular basis matrix with diagonal entries $M^{h-1-j}X^{jd+i}$. Hence, the determinant of $L$ is
$$\det(L) = M^{(h-1)hd/2}X^{(dh-1)dh/2}.$$
Running LLL on this basis outputs an LLL-reduced basis with first vector $b_1$ satisfying
$$\|b_1\| < 2^{(dh-1)/4}\det(L)^{1/(dh)} = 2^{(dh-1)/4}M^{(h-1)/2}X^{(dh-1)/2}.$$
This vector corresponds to a polynomial $G(x)$ of degree $dh - 1$ such that $G(x_0) \equiv 0 \pmod{M^{h-1}}$. If $\|b_1\| < M^{h-1}/\sqrt{dh}$ then Howgrave-Graham's result applies and we have $G(x_0) = 0$ over $\mathbb{Z}$.
Hence, it is sufficient that
$$\sqrt{dh}\,2^{(dh-1)/4}M^{(h-1)/2}X^{(dh-1)/2} < M^{h-1}.$$
Rearranging gives
$$\sqrt{dh}\,2^{(dh-1)/4}X^{(dh-1)/2} < M^{(h-1)/2},$$
which is equivalent to
$$c(d, h)X < M^{(h-1)/(dh-1)}$$
where $c(d, h) = (\sqrt{dh}\,2^{(dh-1)/4})^{2/(dh-1)} = \sqrt{2}\,(dh)^{1/(dh-1)}$.
Now
$$\frac{h-1}{dh-1} = \frac{1}{d} - \frac{d-1}{d(dh-1)}.$$
Equating $(d-1)/(d(dh-1)) = \epsilon$ gives
$$h = ((d-1)/(d\epsilon) + 1)/d \approx 1/(d\epsilon). \qquad (19.3)$$
Note that $dh = 1 + (d-1)/(d\epsilon)$ and so $c(d, h) = \sqrt{2}\,(1 + (d-1)/(d\epsilon))^{d\epsilon/(d-1)}$, which converges to $\sqrt{2}$ as $\epsilon \to 0$. Since $X < \frac{1}{2}M^{1/d-\epsilon}$ we require $\frac{1}{2} \le \frac{1}{c(d,h)}$. Writing $x = d\epsilon/(d-1)$ this is equivalent to $(1 + 1/x)^x \le \sqrt{2}$, which holds for $0 \le x \le 0.18$.
Rounding $h$ up to the next integer gives a lattice such that if
$$|x_0| < \tfrac{1}{2}M^{1/d-\epsilon}$$
then the LLL algorithm and polynomial root finding leads to $x_0$.
Since the dimension of the lattice is $dh \approx 1/\epsilon$ and the coefficients of the polynomials $G_{i,j}$ are bounded by $M^h$ it follows that the running time of LLL depends on $d$, $1/\epsilon$ and $\log(M)$. □

Exercise 19.1.10 Show that the precise complexity of Coppersmith's method is $O((1/\epsilon)^9\log(M)^3)$ bit operations (recall that $1/\epsilon > d$). Note that if one fixes $d$ and $\epsilon$ and considers the problem as $M$ tends to infinity then one has a polynomial-time algorithm in $\log(M)$.

We refer to Section 3 of [132] for some implementation tricks that improve the algorithm. For example, one can add basis vectors to the lattice corresponding to polynomials of the form $M^{h-1}x(x-1)\cdots(x-i+1)/i!$.

Example 19.1.11 Let $p = 2^{30} + 3$, $q = 2^{32} + 15$ and $M = pq$. Consider the polynomial
$$F(x) = a_0 + a_1x + a_2x^2 + x^3 = 1942528644709637042 + 1234567890123456789x + 987654321987654321x^2 + x^3,$$
which has a root $x_0$ modulo $M$ such that $|x_0| \le 2^{14}$. Set $X = 2^{14}$. Note that $X \approx M^{1/4.4}$. One can verify that the basic method in Section 19.1.1 does not find the small root.
Consider the basis matrix (this is of smaller dimension than the lattice in the proof of Theorem 19.1.9 in the case $d = 3$ and $h = 3$)
$$\begin{pmatrix}
M^2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & M^2X & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & M^2X^2 & 0 & 0 & 0 & 0 \\
Ma_0 & Ma_1X & Ma_2X^2 & MX^3 & 0 & 0 & 0 \\
0 & Ma_0X & Ma_1X^2 & Ma_2X^3 & MX^4 & 0 & 0 \\
0 & 0 & Ma_0X^2 & Ma_1X^3 & Ma_2X^4 & MX^5 & 0 \\
a_0^2 & 2a_0a_1X & (a_1^2 + 2a_0a_2)X^2 & (2a_0 + 2a_1a_2)X^3 & (a_2^2 + 2a_1)X^4 & 2a_2X^5 & X^6
\end{pmatrix}.$$
The dimension is 7 and the determinant is $M^9X^{21}$. The first vector of the LLL-reduced basis is
$$(-369928294330603367352173305173409792, 1451057442025994832259962670402797568, \ldots).$$
This corresponds to the polynomial
$$-369928294330603367352173305173409792 + 88565517701781911148679362207202x - 3439987357258441728608570659x^2 + 446358057645551896819258x^3 + 4564259979987386926x^4 - 1728007960413053x^5 - 21177681998x^6,$$
which has $x_0 = 16384 = 2^{14}$ as a real root.

Exercise 19.1.12 Let $M = (2^{20} + 7)(2^{21} + 17)$ and $F(x) = x^3 + (2^{25} - 2883584)x^2 + 46976195x + 2^{27}$. Use Coppersmith's algorithm to find an integer $x_0$ such that $|x_0| < 2^9$ and $F(x_0) \equiv 0 \pmod{M}$.

Remark 19.1.13 It is natural to wonder whether one can find roots right up to the limit $X = M^{1/d}$. Indeed, the $-\epsilon$ term can be eliminated by performing an exhaustive search over the top few bits of the root $x_0$. An alternative way to proceed is to set $\epsilon = 1/\log_2(M)$, break the range $|x_0| < M^{1/d}$ of size $2M^{1/d}$ into $M^{2\epsilon} = 4$ intervals of size $2M^{1/d - 2\epsilon} = M^{1/d - \epsilon}$ and perform Coppersmith's algorithm for each subproblem in turn.

Another question is whether one can go beyond the boundary $X = M^{1/d}$. A first observation is that for $X > M^{1/d}$ one does not necessarily expect a constant number of solutions; see Exercise 19.1.14. Coppersmith [132] gives further arguments why $M^{1/d}$ is the best one can hope for.
Exercise 19.1.14 Let $M = p^2$ and consider $F(x) = x^2 + px$. Show that if $X = M^{1/2 + \epsilon}$ where $0 < \epsilon < 1/2$ then the number of solutions $|x| < X$ to $F(x) \equiv 0 \pmod{M}$ is $2M^{\epsilon}$.

Exercise 19.1.15 Let $N = pq$ be a product of two primes of similar size and let $e \in \mathbb{N}$ be a small integer such that $\gcd(e, \varphi(N)) = 1$. Let $1 < a, y < N$ be such that there is an integer $0 \le x < N^{1/e}$ satisfying $(a + x)^e \equiv y \pmod{N}$. Show that, given $N, e, a, y$, one can compute $x$ in polynomial-time.

19.2 Multivariate modular polynomial equations


Suppose one is given $F(x, y) \in \mathbb{Z}[x, y]$ and integers $X, Y$ and $M$ and is asked to find one or more roots $(x_0, y_0)$ to $F(x, y) \equiv 0 \pmod{M}$ such that $|x_0| < X$ and $|y_0| < Y$. One can proceed using similar ideas to the above, hoping to find two polynomials $F_1(x, y), F_2(x, y) \in \mathbb{Z}[x, y]$ such that $F_1(x_0, y_0) = F_2(x_0, y_0) = 0$ over $\mathbb{Z}$, and such that the resultant $R_x(F_1(x, y), F_2(x, y)) \ne 0$ (i.e., that $F_1(x, y)$ and $F_2(x, y)$ are algebraically independent). This yields a heuristic method in general, since it is hard to guarantee the independence of $F_1(x, y)$ and $F_2(x, y)$.

Theorem 19.2.1 Let $F(x, y) \in \mathbb{Z}[x, y]$ be a polynomial of total degree $d$ (i.e., every monomial $x^iy^j$ satisfies $i + j \le d$). Let $X, Y, M \in \mathbb{N}$ be such that $XY < M^{1/d - \epsilon}$ for some $0 < \epsilon < 1/d$. Then one can compute (in time polynomial in $\log(M)$ and $1/\epsilon > d$) polynomials $F_1(x, y), F_2(x, y) \in \mathbb{Z}[x, y]$ such that, for all $(x_0, y_0) \in \mathbb{Z}^2$ with $|x_0| < X$, $|y_0| < Y$ and $F(x_0, y_0) \equiv 0 \pmod{M}$, one has $F_1(x_0, y_0) = F_2(x_0, y_0) = 0$ over $\mathbb{Z}$.

Proof We refer to Jutla [290] and Section 6.2 of Nguyen and Stern [415] for a sketch of the details. □

19.3 Bivariate integer polynomials


We now consider $F(x, y) \in \mathbb{Z}[x, y]$ and seek a root $(x_0, y_0) \in \mathbb{Z}^2$ such that both $|x_0|$ and $|y_0|$ are small. Coppersmith has proved the following important result.

Theorem 19.3.1 Let $F(x, y) \in \mathbb{Z}[x, y]$ and let $d \in \mathbb{N}$ be such that $\deg_x(F(x, y)), \deg_y(F(x, y)) \le d$. Write
$$F(x, y) = \sum_{0 \le i, j \le d} F_{i,j}x^iy^j.$$
For $X, Y \in \mathbb{N}$ define
$$W = \max_{0 \le i, j \le d} |F_{i,j}|X^iY^j.$$
If $XY < W^{2/(3d)}$ then there is an algorithm that takes as input $F(x, y), X, Y$, runs in time (bit operations) bounded by a polynomial in $\log(W)$ and $2^d$ and outputs all pairs $(x_0, y_0) \in \mathbb{Z}^2$ such that $F(x_0, y_0) = 0$, $|x_0| \le X$ and $|y_0| \le Y$.

The condition in Theorem 19.3.1 is somewhat self-referential. If one starts with a polynomial $F(x, y)$ and bounds $X$ and $Y$ on the size of roots, then one can compute $W$ and determine whether or not the algorithm will succeed in solving the problem.

Proof (Sketch) There are two proofs of this theorem, both of which are rather technical. The original by Coppersmith can be found in [131]. We sketch a simpler proof by Coron [139].
As usual, we consider shifts of the polynomial $F(x, y)$. Choose $k \in \mathbb{N}$ (sufficiently large) and consider the $k^2$ polynomials
$$s_{a,b}(x, y) = x^ay^bF(x, y) \quad \text{for } 0 \le a, b < k$$
in the $(d+k)^2$ monomials $x^iy^j$ with $0 \le i, j < d + k$. Coron chooses a certain set of $k^2$ monomials (specifically of the form $x^{i_0+i}y^{j_0+j}$ for $0 \le i, j < k$ and fixed $0 \le i_0, j_0 \le d$) and obtains a $k^2 \times k^2$ matrix $S$ with non-zero determinant $M$. (The most technical part of [139] is proving that this can always be done and bounding the size of $M$.)
One can now consider the $(d+k)^2$ polynomials $Mx^iy^j$ for $0 \le i, j < d + k$. Writing each polynomial as a row vector of coefficients, we now have a $(k^2 + (d+k)^2) \times (d+k)^2$ matrix. One can order the rows such that the matrix is of the form
$$\begin{pmatrix} S & * \\ MI_{k^2} & 0 \\ 0 & MI_w \end{pmatrix}$$
where $w = (d+k)^2 - k^2$, $*$ represents a $k^2 \times w$ matrix, and $I_w$ denotes the $w \times w$ identity matrix.
Now, since $M = \det(S)$ there exists an integer matrix $S'$ such that $S'S = MI_{k^2}$. Perform the row operations
$$\begin{pmatrix} I_{k^2} & 0 & 0 \\ -S' & I_{k^2} & 0 \\ 0 & 0 & I_w \end{pmatrix}\begin{pmatrix} S & * \\ MI_{k^2} & 0 \\ 0 & MI_w \end{pmatrix} = \begin{pmatrix} S & * \\ 0 & T \\ 0 & MI_w \end{pmatrix}$$
for some $k^2 \times w$ matrix $T$. Further row operations yield a matrix of the form
$$\begin{pmatrix} S & * \\ 0 & T' \\ 0 & 0 \end{pmatrix}$$
for some $w \times w$ integer matrix $T'$.

Coron considers a lattice $L$ corresponding to $T'$ (where the entries in a column corresponding to monomial $x^iy^j$ are multiplied by $X^iY^j$ as in equation (19.2)) and computes the determinant of this lattice. Lattice basis reduction yields a short vector that corresponds to a polynomial $G(x, y)$ with small coefficients such that every root of $F(x, y)$ is a root of $G(x, y)$ modulo $M$. If $(x_0, y_0)$ is a sufficiently small solution to $F(x, y)$ then, using an analogue of Theorem 19.1.2, one infers that $G(x_0, y_0) = 0$ over $\mathbb{Z}$.
A crucial detail is that $G(x, y)$ has no common factor with $F(x, y)$. To show this suppose $G(x, y) = F(x, y)A(x, y)$ for some polynomial (we assume that $F(x, y)$ is irreducible; if not then apply the method to its factors in turn). Then $G(x, y) = \sum_{0 \le i, j < k} A_{i,j}x^iy^jF(x, y)$ and so the vector of coefficients of $G(x, y)$ is a linear combination of the coefficient vectors of the $k^2$ polynomials $s_{a,b}(x, y)$ for $0 \le a, b < k$. But this vector is also a linear combination of the rows of the matrix $(0\ T')$ in the original lattice. Considering the first $k^2$ columns (namely the columns of $S$) one has a linear dependence of the rows in $S$. Since $\det(S) \ne 0$ this is a contradiction.
It follows that the resultant $R_x(F, G)$ is a non-zero polynomial, and so one can find all solutions by finding the integer roots of $R_x(F, G)(y)$ and then solving for $x$.
To determine the complexity it is necessary to compute the determinant of $T'$ and to bound $M$. Coron shows that the method works if $XY < W^{2/(3d) - 1/k}\,2^{-9d}$. To get the stated running time for $XY < W^{2/(3d)}$ Coron proposes setting $k = \log(W)$ and performing exhaustive search on the $O(d)$ highest-order bits of $x_0$ (i.e., running the algorithm a polynomial in $2^d$ times). □

Example 19.3.2 Consider $F(x, y) = axy + bx + cy + d = 127xy - 1207x - 1461y + 21$ with $X = 30$, $Y = 20$. Let $M = 127^4$ (see below).
Consider the $13 \times 9$ matrix (this is taking $k = 2$ in the above proof and introducing the powers $X^iY^j$ from the start)
$$B = \begin{pmatrix}
aX^2Y^2 & bX^2Y & cXY^2 & dXY & 0 & 0 & 0 & 0 & 0 \\
0 & aX^2Y & 0 & cXY & bX^2 & 0 & dX & 0 & 0 \\
0 & 0 & aXY^2 & bXY & 0 & cY^2 & 0 & dY & 0 \\
0 & 0 & 0 & aXY & 0 & 0 & bX & cY & d \\
MX^2Y^2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & MX^2Y & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 & & & & \vdots & & & & \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & M
\end{pmatrix}.$$
We take $S$ to be the matrix
$$\begin{pmatrix} a & b & c & d \\ 0 & a & 0 & c \\ 0 & 0 & a & b \\ 0 & 0 & 0 & a \end{pmatrix}$$
corresponding to the monomials $x^{i_0+i}y^{j_0+j}$ for $0 \le i, j < 2$ and fixed $i_0 = j_0 = 1$. Note that $M = \det(S) = a^4 = 127^4$.
Rather than diagonalising using the method of the proof of Theorem 19.3.1 we compute the Hermite normal form of $B$. This gives the matrix
$$B' = \begin{pmatrix}
aX^2Y^2 & * & * & * & * & * & * & * & * \\
 & aX^2Y & * & * & * & * & * & * & * \\
 & & aXY^2 & * & * & * & * & * & * \\
 & & & aXY & * & * & * & * & * \\
 & & & & 16129X^2 & 16129Y^2 & 100125X & 1064641Y & 202558777 \\
 & & & & & 2048383Y^2 & * & * & * \\
 & & & & & & 2048383X & * & * \\
 & & & & & & & 260144641Y & * \\
 & & & & & & & & 260144641
\end{pmatrix}$$
where blanks are zeroes and $*$ denotes an entry whose value we do not bother to write down. Let $L$ be the $5 \times 5$ matrix formed of columns 5 to 9 of rows 5 to 9 of $B'$. Performing LLL-reduction on $L$ gives a matrix whose first row is
$$(-16129X^2, -16129Y^2, 1048258X, 983742Y, -28446222)$$
corresponding to the polynomial
$$G(x, y) = -16129x^2 - 16129y^2 + 1048258x + 983742y - 28446222.$$
Clearly, $G(x, y)$ is not a multiple of $F(x, y)$ since it has no $xy$ term. Computing resultants and factoring gives the solutions $(x, y) = (21, 21)$ and $(23, 19)$.
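Both claimed roots are easy to verify directly (plain Python):

    # Check of Example 19.3.2: both (21, 21) and (23, 19) are roots of F and G.
    def F(x, y): return 127*x*y - 1207*x - 1461*y + 21
    def G(x, y): return (-16129*x**2 - 16129*y**2 + 1048258*x
                         + 983742*y - 28446222)
    for (x, y) in [(21, 21), (23, 19)]:
        assert F(x, y) == 0 and G(x, y) == 0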

Exercise 19.3.3 The polynomial
$$F(x, y) = 131xy - 1400x + 20y - 1286$$
has an integer solution with $|x| < 30$ and $|y| < 20$. Use Coron's method as in Example 19.3.2 to find $(x, y)$.

The results of this section can be improved by taking into account the specific shape of
the polynomial F (x, y). We refer to Blömer and May [67] for details.
Finally, we remark that results are also known for integer polynomials having three
or more variables, but these are heuristic in the sense that the method produces a list of
polynomials having small roots in common, but there is no guarantee that the polynomials
are algebraically independent.

19.4 Some applications of Coppersmith’s method


19.4.1 Fixed padding schemes in RSA
As discussed in Chapter 1, it is necessary to use padding schemes for RSA encryption (for
example, to increase the length of short messages and to prevent algebraic relationships

between the messages and ciphertexts). One simple proposal for κ-bit RSA moduli is to
take a κ  bit message and pad it by putting (κ − κ  − 1) ones to the left-hand side of it.
This brings a short message to full length. This padding scheme is sometimes called fixed
pattern padding; we discuss it further in Section 24.4.5.
Suppose short messages (for example, 128-bit AES keys K) are being encrypted using
this padding scheme with κ = 1024. Then

m = 21024 − 2128 + K.

Suppose also that the encryption exponent is e = 3. Then the ciphertext is

c = m3 (mod N ).

If such a ciphertext is intercepted then the cryptanalyst only needs to find the value for
K. In this case, we know that K is a solution to the polynomial

F (x) = (21024 − 2128 + x)3 − c ≡ 0 (mod N ).

This is a polynomial of degree 3 with a root modulo N of size at most N 128/1024 = N 1/8 .
So Coppersmith’s method finds the solution K in polynomial-time.

Example 19.4.1 Let $N = 8873554201598479508804632335361$ (which is a 103-bit integer) and suppose Bob is sending 10-bit keys $K$ to Alice using the padding scheme
$$m = 2^{100} - 2^{10} + K.$$
Suppose we have intercepted the ciphertext $c = 8090574557775662005354455491076$ and wish to find $K$. Let $X = 2^{10}$. We write $F(x) = (x + 2^{100} - 2^{10})^3 - c = x^3 + a_2x^2 + a_1x + a_0$ and define
$$B = \begin{pmatrix} N & 0 & 0 & 0 \\ 0 & NX & 0 & 0 \\ 0 & 0 & NX^2 & 0 \\ a_0 & a_1X & a_2X^2 & X^3 \end{pmatrix}.$$
Performing lattice reduction and taking the first row vector gives the polynomial with factorisation
$$(x - 987)(-920735567540915376297 + 726745175435904508x + 277605904865853x^2).$$
One can verify that the message is $K = 987$.
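The whole attack on this example is a few lines (assuming fpylll, and that the example's $N$ and $c$ are transcribed faithfully; names are ours):

    # Sketch of the fixed-padding attack of Example 19.4.1, assuming fpylll.
    from fpylll import IntegerMatrix, LLL

    N = 8873554201598479508804632335361
    c = 8090574557775662005354455491076
    X, pad = 2**10, 2**100 - 2**10
    # F(x) = (x + pad)^3 - c = a0 + a1*x + a2*x^2 + x^3 (mod N)
    a = [(pad**3 - c) % N, (3 * pad**2) % N, (3 * pad) % N, 1]
    rows = [[N * X**i if j == i else 0 for j in range(4)] for i in range(3)]
    rows.append([a[i] * X**i for i in range(4)])
    A = IntegerMatrix.from_matrix(rows)
    LLL.reduction(A)
    G = [A[0, j] // X**j for j in range(4)]
    print([K for K in range(X) if sum(G[i] * K**i for i in range(4)) == 0])  # [987]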

19.4.2 Factoring N = pq with partial knowledge of p


Let $N = pq$ and suppose we are given an approximation $\tilde{p}$ to $p$ such that $p = \tilde{p} + x_0$ where $|x_0| < X$. For example, suppose $p$ is a $2\kappa$-bit prime and $\tilde{p}$ is an integer that has the same $\kappa$ most significant bits as $p$ (so that $|p - \tilde{p}| < 2^{\kappa}$). Coppersmith used his ideas to get an algorithm for finding $p$ given $N$ and $\tilde{p}$. Note that Coppersmith originally used a bivariate
polynomial method, but we present a simpler version following work of Howgrave-Graham, Boneh, Durfee and others.
The polynomial $F(x) = (x + \tilde{p})$ has a small solution modulo $p$. The problem is that we do not know $p$, but we do know a multiple of $p$ (namely, $N$). The idea is to form a lattice corresponding to polynomials that have a small root modulo $p$ and to apply Coppersmith's method.

Theorem 19.4.2 Let $N = pq$ with $p < q < 2p$. Let $0 < \epsilon < 1/4$, and suppose $\tilde{p} \in \mathbb{N}$ is such that $|p - \tilde{p}| \le \frac{1}{2\sqrt{2}}N^{1/4 - \epsilon}$. Then given $N$ and $\tilde{p}$ one can factor $N$ in time polynomial in $\log(N)$ and $1/\epsilon$.
√ √
Proof Write F (x) = (x + p̃) and note that N/2 ≤ p ≤ N . Let X =  2√1 2 N 1/4− .
We describe the lattice to be used. Let h ≥ 4 be an integer to be determined later and let
k = 2h. Consider the k + 1 polynomials
N h , N h−1 F (x), N h−2 F (x)2 , . . . , N F (x)h−1 , F (x)h , xF (x)h , . . . , x k−h F (x)h .
Note that if p = p̃ + x0 and if G(x) is one of these polynomials then G(x0 ) ≡ 0 (mod ph ).
Consider the lattice corresponding to the above polynomials. More precisely, a basis
for the lattice is obtained by taking each polynomial G(x) above and writing the vector of
coefficients of the polynomial G(x) as in equation (19.1). The lattice has dimension k + 1
and determinant N h(h+1)/2 Xk(k+1)/2 .
Applying LLL gives a short √ vector and, to apply Howgrave-Graham’s result, we
1/(k+1) 1/2
need
√ 2 k/4
det(L) < p h
/ k + 1. Hence, since p > (N/2) , it is sufficient that
k+12 N X < (N/2) . Rearranging gives
k/4 h(h+1)/(2(k+1)) k/2 h/2

X < N h/k−h(h+1)/(k(k+1)) 2−h/k 2−1/2 /(k + 1)1/k .



Since k ≥ 7 we have (k + 1)1/k = 2log2 (k+1)/k ≤ 21/2 and so 1/(k + 1)1/k ≥ 1/ 2.
Now, since k = 2h we find that the result holds if
1
X < N 1/2(1−(h+1)/(2h+1)) √ .
2 2
Since 1/2(1 − (h + 1)/(2h + 1)) = 1/4 − 1/(4(2h + 1)) the result will follow if
1/(4(2h + 1)) < . Taking h ≥ max{4, 1/(4)} is sufficient. �
One can obtain a more general version of Theorem 19.4.2. If $p = N^{\alpha}$ and $|x| \le N^{\beta}$ where $0 < \alpha, \beta < 1$ then, ignoring constants, the required condition in the proof is
$$\frac{h(h+1)}{2} + \frac{\beta k(k+1)}{2} < \alpha h(k+1).$$
Taking $h = \beta k$ and simplifying gives $\beta < \alpha^2$. The case we have shown is $\alpha = 1/2$ and $\beta < 1/4$. For details see Exercise 19.4.5 or Theorems 6 and 7 of [370].
Example 19.4.3 Let $N = 16803551$, $\tilde{p} = 2830$ and $X = 10$.
Let $F(x) = (x + \tilde{p})$ and consider the polynomials $N$, $F(x)$, $xF(x) = (x^2 + \tilde{p}x)$ and $x^2F(x)$, which all have the same small solution $x_0$ modulo $p$.
19.4 Some applications of Coppersmith’s method 393

We build the lattice corresponding to these polynomials (with the usual method of
converting a polynomial into a row vector). This lattice has basis matrix
 
N 0 0 0
 p̃ X 0 0
 .
 0 p̃X X 2 0
0 0 p̃X2 X3
The first row of the output of the LLL algorithm on this matrix is (105, −1200, 800, 1000),
which corresponds to the polynomial

G(x) = x^3 + 8x^2 − 120x + 105.

The polynomial has the root x = 7 over Z. We can check that p = p̃ + 7 = 2837 is a factor
of N .
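This computation is easy to reproduce. Here is a minimal sketch in Python, assuming the fpylll lattice-reduction library (the variable names are my own):

# Reproduce Example 19.4.3: find p given N = pq and an approximation p_tilde.
# Assumes the fpylll library (https://github.com/fplll/fpylll).
from math import gcd
from fpylll import IntegerMatrix, LLL

N, p_tilde, X = 16803551, 2830, 10

# Rows are the coefficient vectors of N, F(x), xF(x), x^2 F(x), where
# F(x) = x + p_tilde, with the column for x^i scaled by X^i.
B = IntegerMatrix.from_matrix([
    [N, 0, 0, 0],
    [p_tilde, X, 0, 0],
    [0, p_tilde * X, X**2, 0],
    [0, 0, p_tilde * X**2, X**3],
])
LLL.reduction(B)

# Undo the scaling to recover a polynomial G with a small root over Z.
g = [B[0][i] // X**i for i in range(4)]
for x0 in range(-X, X + 1):
    if sum(c * x0**i for i, c in enumerate(g)) == 0 and gcd(p_tilde + x0, N) > 1:
        print("p =", p_tilde + x0)   # should print p = 2837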

Exercise 19.4.4 Let N = 22461580086470571723189523 and suppose you are given the approximation p̃ = 2736273600000 to p, which is correct up to an additive error 0 ≤ x < X = 50000 (i.e., p = p̃ + x). Find the prime factorisation of N using Coppersmith's method.

Exercise 19.4.5 Let ε > 0. Let F(x) be a polynomial of degree d such that F(x_0) ≡ 0 (mod M) for some M | N, where M = N^α and |x_0| ≤ (1/2) N^{α^2/d − ε}. Generalise the proof of Theorem 19.4.2 to show that given F(x) and N one can compute x_0 in time polynomial in log(N), d and 1/ε.

Exercise 19.4.6 Coppersmith showed that one can factor N in time polynomial in log(N) given p̃ such that |p − p̃| < N^{1/4}. Prove this result.

Exercise 19.4.7 Use Coppersmith's method to give an integer factorisation algorithm requiring Õ(N^{1/4}) bit operations. (A factoring algorithm with this complexity was also given in Section 12.5.)

Exercise 19.4.8 Show that the method of this section also works if given p̃ such that |p̃ − kp| < N^{1/4} for some integer k such that gcd(k, N) = 1.

Exercise 19.4.9 Coppersmith also showed that one can factor N in time polynomial in log(N) given p̃ such that p ≡ p̃ (mod M) where M > N^{1/4}. Prove this result.

Exercise 19.4.10 Let N = pq with p ≈ q. Show that if one knows half the high order bits
of p then one also knows approximately half the high order bits of q as well.

19.4.3 Factoring p^r q

As mentioned in Section 24.1.2, moduli of the form p^r q, where p and q are distinct primes and r ∈ N, can be useful for some applications. When r is large then p is relatively small compared with N and so a natural attack is to try to factor N using the elliptic curve method.
Boneh, Durfee and Howgrave-Graham [74] considered using Coppersmith's method to factor integers of the form N = p^r q when r is large. They observed that if one knows r and an approximation p̃ to p then there is a small root of the polynomial equation

F(x) = (p̃ + x)^r ≡ 0 (mod p^r)

and that p^r is a large factor of N. One can therefore apply the technique of Section 19.4.2. The algorithm is to repeat the above for all p̃ in a suitably chosen set. An analysis of the complexity of the method is given in [74]. It is shown that if r ≥ log(p) then the algorithm runs in polynomial-time and that if r ≈ √(log_2(p)) then the algorithm is asymptotically faster than using the elliptic curve method. One specific example mentioned in [74] is that if p, q ≈ 2^512 and r = 23 then N = p^r q should be factored more quickly by their method than with the elliptic curve method.
When r is small it is believed that moduli of the form N = p^r q are still hard to factor. For 3072-bit moduli, taking r = 3 and p, q ≈ 2^768 should be such that the best-known attack requires at least 2^128 bit operations.
Exercise 19.4.11 The integer N = 876701170324027 is of the form p^3 q where |p − 5000| < 10. Use the method of this section to factor N.

19.4.4 Chinese remaindering with errors


Boneh [70], building on work of Goldreich, Ron and Sudan, used ideas very similar to
Coppersmith’s method to give an algorithm for the following problem in certain cases.
Definition 19.4.12 Let X, p_1, . . . , p_n, r_1, . . . , r_n ∈ Z_{≥0} be such that p_1 < p_2 < · · · < p_n and 0 ≤ r_i < p_i for all 1 ≤ i ≤ n. Let 1 ≤ e ≤ n be an integer. The Chinese remaindering with errors problem (or CRT list decoding problem) is to compute an integer 0 ≤ x < X (if it exists) such that

x ≡ r_i (mod p_i)

for all but e of the indices 1 ≤ i ≤ n.
Note that it is not assumed that the integers pi are coprime, though in many applications
they will be distinct primes or prime powers. Also note that there is not necessarily a
solution to the problem (for example, if X and/or e are too small).
Exercise 19.4.13 A naive approach to this problem is to run the Chinese remainder algo-
rithm for all subsets S ⊆ {p1 , . . . , pn } such that #S = (n − e). Determine the complexity
of this algorithm. What is the input size of a Chinese remainder with errors instance when
0 ≤ ri < pi ? Show that this algorithm is not polynomial in the input size if e > log(n).
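For concreteness, here is a direct Python implementation of this naive approach (a sketch, assuming the moduli are pairwise coprime and Python 3.8+; the function names are my own):

# Brute-force Chinese remaindering with errors (the naive approach of
# Exercise 19.4.13): try every subset of n - e moduli and solve by CRT.
from itertools import combinations

def crt(residues, moduli):
    # Standard iterative CRT for pairwise coprime moduli: x ≡ r_i (mod m_i).
    x, M = 0, 1
    for r, m in zip(residues, moduli):
        t = (r - x) * pow(M, -1, m) % m   # needs gcd(M, m) = 1
        x, M = x + M * t, M * m
    return x % M

def naive_crt_with_errors(X, moduli, residues, e):
    n = len(moduli)
    for S in combinations(range(n), n - e):
        x = crt([residues[i] for i in S], [moduli[i] for i in S])
        if x < X:
            return x
    return None

# Toy instance: x = 42, with the residue modulo 11 corrupted.
print(naive_crt_with_errors(100, [5, 7, 11, 13], [2, 0, 5, 3], e=1))  # 42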
The basic idea of Boneh's method is to construct a polynomial F(x) ∈ Z[x] such that all solutions x to the Chinese remaindering with errors problem instance are roots of F(x) over Z. This is done as follows. Define P = ∏_{i=1}^n p_i and let 0 ≤ R < P be the solution to the Chinese remainder instance (i.e., R ≡ r_i (mod p_i) for all 1 ≤ i ≤ n). For an integer x define the amplitude amp(x) = gcd(P, x − R) so that, if the p_i are coprime and S is the set of indices 1 ≤ i ≤ n such that x ≡ r_i (mod p_i), amp(x) = ∏_{i∈S} p_i. Write F(x) = x − R. The problem is precisely to find an integer x such that |x| < X and F(x) ≡ 0 (mod M) for some large integer M | P. This is the problem solved by Coppersmith's algorithm in the variant of Exercise 19.4.5. Note that p_1^n ≤ P ≤ p_n^n and so n log(p_1) ≤ log(P) ≤ n log(p_n).
Theorem 19.4.14 Let X, e, p_1, . . . , p_n, r_1, . . . , r_n be an instance of the Chinese remainder with errors problem, where p_1 < p_2 < · · · < p_n. Let P = p_1 · · · p_n. There is an algorithm to compute all x ∈ Z such that |x| < X and x ≡ r_i (mod p_i) for all but e values 1 ≤ i ≤ n as long as

e ≤ n − n (log(p_n)/log(p_1)) √(log(X)/log(P)).

The algorithm is polynomial-time in the input size.

Proof Boneh [70] gives a direct proof, but we follow Section 4.7 of May [371] and derive the result using Exercise 19.4.5.
Let 0 ≤ x < X be an integer with M = amp(x) being divisible by at least n − e of the values p_i. We have n log(p_1) ≤ log(P) ≤ n log(p_n) and (n − e) log(p_1) ≤ log(M) ≤ n log(p_n). Write M = P^β. Then Coppersmith's algorithm finds x if X < P^{β^2}, in time polynomial in n and log(p_n) (note that Exercise 19.4.5 states the result for X < P^{β^2 − ε}, but we can remove the ε using the same ideas as Remark 19.1.13). Hence, it is sufficient to give a bound on e so that log(X)/log(P) < β^2 (i.e., β > √(log(X)/log(P))). Now, β = log(M)/log(P) ≥ (n − e) log(p_1)/(n log(p_n)). Hence, it is sufficient that

(n − e) (log(p_1)/log(p_n)) ≥ n √(log(X)/log(P)),

which is equivalent to the equation in the theorem. □
Exercise 19.4.15 Suppose p_1, . . . , p_n are the first n primes. Show that the above algorithm works when e ≈ n − √(n log(X) log(n)) (up to constant factors). Hence, verify that Boneh's algorithm is polynomial-time in situations where the naive algorithm of Exercise 19.4.13 would be superpolynomial-time.
Bleichenbacher and Nguyen [65] discuss a variant of the Chinese remaindering with
errors problem (namely, solving x ≡ ri (mod pi ) for small x, where each ri lies in a set
of m possible values) and a related problem in polynomial interpolation. Section 5 of [65]
gives some algorithms for this “noisy CRT” problem.

Smooth integers in short intervals


The above methods can be used to find smooth integers in intervals. Let I = [U, V] = {x ∈ Z : U ≤ x ≤ V} and suppose we want to find a B-smooth integer x ∈ I if one exists (i.e., all primes dividing x are at most B). We assume that V < 2U.

Exercise 19.4.16 Show that if V ≥ 2U then one can compute a power of 2 in [U, V].

A serious problem is that only rather weak results have been proven about smooth integers in short intervals (see Section 4 of Granville [243], Sections 6.2 and 7.2 of Naccache and Shparlinski [404] or Section 15.3). Hence, we cannot expect to be able to prove anything rigorous in this section. On the other hand, it is natural to conjecture that, at least most of the time, the probability that a randomly chosen integer in a short interval [U, V] is B-smooth is roughly equal to the probability that a randomly chosen integer of size V is B-smooth. Multiplying this probability by the length of the interval gives a rough guide to whether it is reasonable to expect a solution (see Remark 15.3.5). Hence, for the remainder of this section, we assume that such an integer x exists. We now sketch how the previous results might be used to find x.
Let W = ⌈(U + V)/2⌉ and X = ⌈(V − U)/2⌉ so that I ⊆ [W − X, W + X]. We seek all x ∈ Z such that |x| ≤ X and x ≡ −W (mod p_i^{e_i}) for certain prime powers where p_i ≤ B. Then W + x is a potentially smooth integer in the desired interval (we know that W + x has a large smooth factor, but this may not imply that all prime factors of W + x are small if W is very large). One therefore chooses P = ∏_{i=1}^l p_i^{e_i} where p_1, . . . , p_l are the primes up to B and the e_i are suitably chosen exponents (e.g. e_i = ⌈log(W)/(log(B) log(p_i))⌉). One then applies Boneh's algorithm. The output is an integer with a large common divisor with P (indeed, this is a special case of the approximate GCD problem considered in Section 19.6). Note that this yields rather "dense" numbers, in the sense that they are divisible by most of the first l primes.

Example 19.4.17 Let P = 2^4 · 3^2 · 5 · 7 · 11 · 13 · 17 · 19 = 232792560. Let W = 100000007 = 10^8 + 7 and X = 1000000 = 10^6. We want to find an integer x between W − X and W + X such that x is divisible by most of the prime powers dividing P.
Taking R = −W and F(x) = x − R, and using the seven polynomials P^4, P^3 F(x), P^2 F(x)^2, P F(x)^3, F(x)^4, x F(x)^4 and x^2 F(x)^4 (the analogue, with P in place of N, of the polynomials in the proof of Theorem 19.4.2 with h = 4 and k = 6) gives the lattice with basis matrix

$$\begin{pmatrix}
P^4 & 0 & 0 & 0 & 0 & 0 & 0 \\
-RP^3 & P^3 X & 0 & 0 & 0 & 0 & 0 \\
R^2 P^2 & -2RP^2 X & P^2 X^2 & 0 & 0 & 0 & 0 \\
-R^3 P & 3R^2 P X & -3RP X^2 & P X^3 & 0 & 0 & 0 \\
R^4 & -4R^3 X & 6R^2 X^2 & -4R X^3 & X^4 & 0 & 0 \\
0 & R^4 X & -4R^3 X^2 & 6R^2 X^3 & -4R X^4 & X^5 & 0 \\
0 & 0 & R^4 X^2 & -4R^3 X^3 & 6R^2 X^4 & -4R X^5 & X^6
\end{pmatrix}.$$
The polynomial corresponding to the first row of the LLL-reduced basis is

G(x) = −7^4 (x + 231767)^4,

giving the root x = −231767. Indeed

W − 231767 = 2^4 · 3^3 · 5 · 11 · 13 · 17 · 19.

Note that the algorithm does not output 10^8 = 2^8 · 5^8 since that number does not have a very large gcd with P.
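A quick check in Python confirms that the value found shares a large factor with P, while 10^8 does not:

# Check Example 19.4.17: W + x has a large common divisor with P.
from math import gcd
P = 2**4 * 3**2 * 5 * 7 * 11 * 13 * 17 * 19   # 232792560
W, x = 10**8 + 7, -231767
print(gcd(P, W + x))   # 33256080 = P/7: W + x is divisible by every
                       # prime power in P except the factor 7
print(gcd(P, 10**8))   # only 80 = 2^4 * 5: this is why 10^8 is not found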

Exercise 19.4.18 Repeat the above example for W = 150000001 = 1.5 · 10^8 + 1 and W = 46558000.

If this process fails then one can make adjustments to the value of P (for example, by
changing the exponents ei ). Analysing the probability of success of this approach is an
open problem.

19.5 Simultaneous Diophantine approximation


Let α ∈ R. It is well-known that the continued fraction algorithm produces a sequence of rational numbers p/q such that |α − p/q| < 1/q^2. This is the subject of Diophantine approximation; see Section 1.1 of Lovász [356] for background and discussion. We now define a natural and important generalisation of this problem.

Definition 19.5.1 Let α_1, . . . , α_n ∈ R and let ε > 0. Let Q ∈ N be such that Q ≥ ε^{−n}. The simultaneous Diophantine approximation problem is to find q, p_1, . . . , p_n ∈ Z such that 0 < q ≤ Q and

|α_i − p_i/q| ≤ ε/q                     (19.4)

for all 1 ≤ i ≤ n.

A theorem of Dirichlet mentioned in Section 1.1 of [356] and Section 17.3 of [220]
shows that there is a solution satisfying the constraints in Definition 19.5.1.

Exercise 19.5.2 Let ε ≥ 1/2. Prove that integers p_1, . . . , p_n satisfying equation (19.4) exist for any n and q.

A major application of lattice reduction is to give an algorithm to compute the integers (q, p_1, . . . , p_n) in Definition 19.5.1. In practice, the real numbers α_1, . . . , α_n are given to some decimal precision (and so are rational numbers with coefficients of some size). The size of an instance of the simultaneous Diophantine approximation problem is the sum of the bit lengths of the numerator and denominator of the given approximations to the α_i, together with the bit length of the representation of ε and Q. Let X be a bound on the absolute value of all numerators and denominators of the α_i. The computational task is to find a solution (q, p_1, . . . , p_n) in time that is polynomial in n, log(X), log(1/ε) and log(Q).

Theorem 19.5.3 Let α_1, . . . , α_n ∈ Q be given as rational numbers with numerator and denominator bounded in absolute value by X. Let 0 < ε < 1. One can compute in polynomial-time integers (q, p_1, . . . , p_n) such that 0 < q < 2^{n(n+1)/4} ε^{−(n+1)} and |α_i − p_i/q| ≤ ε/q for all 1 ≤ i ≤ n.

Proof Let Q = 2^{n(n+1)/4} ε^{−n} and consider the lattice L ⊆ Q^{n+1} with basis matrix

$$\begin{pmatrix}
\epsilon/Q & \alpha_1 & \alpha_2 & \cdots & \alpha_n \\
0 & -1 & 0 & \cdots & 0 \\
0 & 0 & -1 & & \\
\vdots & \vdots & & \ddots & \\
0 & 0 & & \cdots & -1
\end{pmatrix} \tag{19.5}$$

The dimension is n + 1 and the determinant is ε/Q = 2^{−n(n+1)/4} ε^{n+1}. Every vector in the lattice is of the form (qε/Q, qα_1 − p_1, qα_2 − p_2, . . . , qα_n − p_n). The entries of the lattice are ratios of integers with absolute value bounded by max{X, 2^{n(n+1)/4}/ε^{n+1}}.
Note that the lattice L does not have a basis with entries in Z, but rather in Q. By Remark 17.5.5 the LLL algorithm applied to L runs in O(n^6 max{n log(X), n^2 + n log(1/ε)}^3) bit operations (which is polynomial in the input size) and outputs a non-zero vector v = (qε/Q, qα_1 − p_1, . . . , qα_n − p_n) such that

‖v‖ ≤ 2^{n/4} det(L)^{1/(n+1)} = 2^{n/4} 2^{−n/4} ε = ε < 1.

If q = 0 then v = (0, −p_1, . . . , −p_n) with some p_i ≠ 0 and so ‖v‖ ≥ 1; hence, q ≠ 0. Without loss of generality q > 0. Since ‖v‖_∞ ≤ ‖v‖ it follows that qε/Q ≤ ε < 1 and so 0 < q ≤ Q < 2^{n(n+1)/4} ε^{−(n+1)}. Similarly, |qα_i − p_i| < ε for all 1 ≤ i ≤ n. □
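The construction in the proof is easy to try in practice. Below is a minimal sketch in Python, assuming the fpylll library; the lattice is scaled by S = 10^8 so that all entries become integers, and the α_i here are 6-decimal-digit approximations to √2 and √3, chosen purely for illustration:

# Simultaneous Diophantine approximation via LLL (proof of Theorem 19.5.3).
from fpylll import IntegerMatrix, LLL

alphas = [1414214, 1732051]   # alpha_i = alphas[i] / 10**6 (approx sqrt(2), sqrt(3))
D = 10**6
eps_num, eps_den = 1, 100     # eps = 1/100
Q = 10**6                     # satisfies Q >= eps^(-n) for n = 2
S = 10**8                     # scaling so that S*eps/Q and S*alpha_i are integers

n = len(alphas)
B = IntegerMatrix(n + 1, n + 1)
B[0, 0] = S * eps_num // (eps_den * Q)   # = 1 here
for i, a in enumerate(alphas):
    B[0, i + 1] = S * a // D
    B[i + 1, i + 1] = -S
LLL.reduction(B)

q = abs(B[0][0])              # first coordinate of the short vector is q*(S*eps/Q) = q
ps = [round(q * a / D) for a in alphas]
print(q, ps)                  # a candidate q, p_i with |alpha_i - p_i/q| <= eps/q

Note that a lattice vector with q = 0 has length at least S, far above the LLL bound 2^{n/4} det(L)^{1/(n+1)}, so the first reduced row does yield q ≠ 0, exactly as in the proof.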

Exercise 19.5.4 Let α_1 = 1.555111, α_2 = 0.771111 and α_3 = 0.333333. Let ε = 0.01 and Q = 10^6. Use the method of this section to find a good simultaneous rational approximation to these numbers.

See Section 17.3 of [220] for more details and references.

19.6 Approximate integer greatest common divisors


The basic problem is the following. Suppose positive integers a and b exist such that d = gcd(a, b) is "large". Suppose that one is not given a and b, but only approximations ã, b̃ to them. The problem is to find d, a and b. One issue is that there can be surprisingly many solutions to the problem (see Example 19.6.4), so it may not be feasible to compute all solutions for certain parameters. On the other hand, in the case b̃ = b (i.e., one of the values is known exactly, which often happens in practice) there are relatively few solutions.
Howgrave-Graham [269] has considered these problems and has given algorithms that apply in various situations. We present one of the basic ideas. Let a = ã + x and b = b̃ + y. Suppose ã < b̃ and define q_a = a/d and q_b = b/d. Then, since q_a/q_b = a/b, we have

ã/b̃ − q_a/q_b = (q_a y − q_b x)/(b̃ q_b).                     (19.6)

If the right hand side of equation (19.6) is small then performing Euclid's algorithm on ã/b̃ gives a sequence of possible values for q_a/q_b. For each such value, one can compute

⌊b̃/q_b⌉ = ⌊(d q_b − y)/q_b⌉ = d + ⌊−y/q_b⌉.

If |y| < (1/2) q_b then one has computed d exactly and can solve ã + x ≡ b̃ + y ≡ 0 (mod d). Note that one must use the basic extended Euclidean algorithm, rather than the improved method using negative remainders as in Algorithm 1.

Exercise 19.6.1 Show that if a < b < b̃, b^{2/3} < d < 2b^{2/3} and |x|, |y| < (1/4) b^{1/3} then the above method finds d, a and b.

Exercise 19.6.2 Let the notation be as above. Suppose |x|, |y| < b̃^β and d = b̃^α. Explain why it is natural to assume α > β. Show that the above method succeeds if (ignoring constant factors) β < −1 + 2α and β < 1 − α.

Exercise 19.6.3 Re-formulate this method in terms of finding a short vector in the lattice generated by the rows of a 2 × 2 matrix. Derive the same conditions on α and β as in Exercise 19.6.2.

Example 19.6.4 Let ã = 617283157 and b̃ = 630864082. The first few convergents q_a/q_b to ã/b̃ are 1, 45/46, 91/93, 409/418, 500/511, 1409/1440 and 1909/1951. Computing approximations to ã/q_a and b̃/q_b for these values (except the first) gives the following table.

q_a/q_b   45/46        91/93       409/418     500/511     1409/1440   1909/1951
ã/q_a     13717403.5   6783331.4   1509249.8   1234566.3   438100.2    323354.2
b̃/q_b     13714436.6   6783484.8   1509244.2   1234567.7   438100.1    323354.2

Any values around these numbers can be used as a guess for d. For example, taking d = 13717403 one finds ã − 22 ≡ b̃ + 136456 ≡ 0 (mod d), which is not a particularly good solution.
The four values d_1 = 1234566, d_2 = 1234567, d_3 = 438100 and d_4 = 323354 lead to the solutions ã − 157 ≡ b̃ − 856 ≡ 0 (mod d_1), ã + 343 ≡ b̃ − 345 ≡ 0 (mod d_2), ã − 257 ≡ b̃ − 82 ≡ 0 (mod d_3) and ã − 371 ≡ b̃ − 428 ≡ 0 (mod d_4).
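The convergents in this example are easy to recompute; here is a short Python sketch (the helper name is my own):

# Recompute the table of Example 19.6.4: convergents of a/b via Euclid,
# then the candidate values a/qa and b/qb for the approximate gcd d.
def convergents(num, den):
    # Convergents p/q of the continued fraction expansion of num/den.
    p0, p1, q0, q1, out = 0, 1, 1, 0, []
    while den:
        a, rem = divmod(num, den)
        p0, p1 = p1, a * p1 + p0
        q0, q1 = q1, a * q1 + q0
        out.append((p1, q1))
        num, den = den, rem
    return out

a, b = 617283157, 630864082
for qa, qb in convergents(a, b)[2:8]:   # skip the initial convergents 0/1 and 1/1
    print(qa, qb, round(a / qa, 1), round(b / qb, 1))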

Howgrave-Graham gives a more general method for solving the problem that does not require such a strict condition on the size of y. The result relies on heuristic assumptions about Coppersmith's method for bivariate integer polynomials. We state this result as Conjecture 19.6.5.

Conjecture 19.6.5 (Algorithm 14 and Section 4 of [269]) Let 0 < α < 2/3 and β < 1 − α/2 − √(1 − α − α^2/2). There is a polynomial-time algorithm that takes as input ã < b̃ and outputs all integers d > b̃^α such that there exist integers x, y with |x|, |y| < b̃^β and d | (ã + x) and d | (b̃ + y).

Exercise 19.6.6 Let ã, b̃, X, Y ∈ N be given with X < ã < b̃. Give a brute force algorithm to output all d > Y such that there exist x, y ∈ Z with |x|, |y| ≤ X and d = gcd(ã + x, b̃ + y). Show that the complexity of this algorithm is O(X^2 log(b̃)^2) bit operations.
We now mention the case when b̃ = b (in other words, b is known exactly). The natural approach is to consider the polynomial F(x) = ã + x, which has a small solution to the equation F(x) ≡ 0 (mod d) for some d | b. Howgrave-Graham applies the method used in Section 19.4.2 to solve this problem.

Theorem 19.6.7 (Algorithm 12 and Section 3 of [269]) Let 0 < α < 1 and β < α^2. There is a polynomial-time algorithm that takes as input ã, b and outputs all integers d > b^α such that there exists an integer x with |x| < b^β and d | (ã + x) and d | b.

19.7 Learning with errors


The learning with errors problem was proposed by Regev. There is a large literature on
this problem; we refer to Micciancio and Regev [379] and Regev [446] for background and
references.
Definition 19.7.1 Let q ∈ N (typically prime), σ ∈ R_{>0}, and n, m ∈ N with m > n.¹ Let s ∈ (Z/qZ)^n. The LWE distribution is the distribution on (Z/qZ)^{m×n} × (Z/qZ)^m corresponding to choosing uniformly at random an m × n matrix A with entries in Z/qZ and a length m vector

c ≡ As + e (mod q)

where the vector e has entries chosen independently from a discretised normal distribution² on Z with mean 0 and standard deviation σ. The learning with errors problem (LWE) is: given (A, c) drawn from the LWE distribution, to compute the vector s. The decision learning with errors problem (DLWE) is: given A as above and a vector c ∈ (Z/qZ)^m, to determine whether (A, c) is drawn from the uniform distribution or the LWE distribution.
It is necessary to argue that LWE is well-defined since, for any choice s′, the value c − As′ (mod q) is a possible choice for e. But, when m is sufficiently large, one value for s is much more likely to have been used than any of the others. Hence, LWE is a maximum likelihood problem. Similarly, DLWE is well-defined when m is sufficiently large: if c is chosen uniformly at random and independent of A then there is not likely to be a choice for s such that c − As (mod q) is significantly smaller than the other values c − As′ (mod q). We do not make these arguments precise. It follows that m must be significantly larger than n for these problems to be meaningful. It is also clear that increasing m (but keeping n fixed) does not make the LWE problem harder.
We refer to [379] and [446] for surveys of cryptographic applications of LWE and
reductions, from computational problems in lattices that are believed to be hard, to LWE.
Note that the values m, q and σ in an LWE instance are usually determined by constraints
coming from the cryptographic application, while n is the main security parameter.

¹ For theoretical applications one should not assume a fixed number m of rows for A. Instead, the attacker is given an oracle that outputs pairs (a, c) where a is a row of A and c = a·s + e (mod q).
² In other words, the probability that e_i is equal to x ∈ Z is proportional to e^{−x²/(2σ²)}.

Example 19.7.2 Table 3 of Micciancio and Regev [379] suggests the parameters

(n, m, q, σ) = (233, 4536, 32749, 2.8).

Lindner and Peikert [352] suggest (using Figure 4 and the condition m ≥ 2n + ℓ with ℓ = 128)

(n, m, q, σ) = (256, 640, 4093, 3.3).
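A toy instance of Definition 19.7.1 can be generated as follows (a sketch in Python; a rounded continuous Gaussian is used as a stand-in for the discretised normal distribution, and the parameters are deliberately tiny):

# Generate a toy LWE instance (A, c) with planted secret s.
import random

def lwe_sample(n, m, q, sigma):
    s = [random.randrange(q) for _ in range(n)]
    A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
    e = [round(random.gauss(0, sigma)) for _ in range(m)]   # rounded Gaussian error
    c = [(sum(A[i][j] * s[j] for j in range(n)) + e[i]) % q for i in range(m)]
    return A, c, s   # s is returned only so that experiments can check answers

A, c, s = lwe_sample(n=16, m=32, q=97, sigma=1.0)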

Exercise 19.7.3 Show that if one can determine e then one can solve LWE efficiently.

Exercise 19.7.4 Show that, when q is prime, LWE ≤_R DLWE. Show that DLWE ≤_R LWE.

We now briefly sketch two lattice attacks on LWE. These attacks can be avoided by
taking appropriate parameters. For other attacks on LWE see [446].

Example 19.7.5 (Lattice attack on DLWE using short vectors in kernel lattice modulo q.) Suppose one can find a short vector w in the lattice

{w ∈ Z^m : wA ≡ 0 (mod q)}.

Then wc = wAs + we ≡ we (mod q). If w is short enough then one might expect that we is a small integer. On the other hand, if c is independent of A then wc (mod q) is a random integer modulo q. Hence, one might be able to distinguish the LWE distribution from the uniform distribution using short enough vectors w.
Note that one is not obliged to use all the rows of A in this attack, and so one can replace m by a much smaller value m′. For analysis of the best value for m′, and for parameters that resist this attack, see Section 5.4.1 (especially equation (10)) of [379].
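A sketch of the resulting distinguishing statistic, assuming a suitable short vector w has already been found (e.g. by lattice reduction on the kernel lattice):

# Distinguishing statistic from Example 19.7.5: w*c mod q, centred to
# (-q/2, q/2]. For LWE samples it equals w*e, which is small when w is
# short; for uniform c it is (close to) uniform modulo q.
def centred(x, q):
    x %= q
    return x - q if x > q // 2 else x

def kernel_statistic(w, c, q):
    return centred(sum(wi * ci for wi, ci in zip(w, c)), q)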

Example 19.7.6 (Reducing LWE to CVP.) We now consider a natural approach to solving LWE using lattices. Since we always use row lattices, it is appropriate to take the transpose of LWE. Hence, suppose c, s and e are row vectors (of lengths m, n and m respectively) such that c ≡ sA^T + e (mod q).
Consider the lattice

L = {v ∈ Z^m : v ≡ uA^T (mod q) for some u ∈ Z^n}.

Then L has rank m and a basis matrix for it is computed by taking the (row) Hermite normal form of the (n + m) × m matrix

$$\begin{pmatrix} A^T \\ qI_m \end{pmatrix}$$

where I_m is an m × m identity matrix. One then tries to find an element v of L that is close to c. Hopefully, v = c − e ≡ sA^T (mod q).

One can perform lattice basis reduction and apply the nearest plane algorithm. For improved methods and experimental results see Lindner and Peikert [352]. As in Example 19.7.5, one can work with a subset of m′ rows of A; see Section 5.1 of [352] for details.

19.8 Further applications of lattice reduction

There are a number of other applications of lattices in cryptography. We briefly list some of them.
• The improvement by Boneh and Durfee of Wiener's attack on small private exponent RSA. This is briefly mentioned in Section 24.5.1.
• Solving the hidden number problem in finite fields and its applications to the bit security of Diffie–Hellman key exchange. See Section 21.7.1.
• The attack by Howgrave-Graham and Smart on digital signature schemes when there is partial information available about the random nonces. See Section 22.3.
• The deterministic reduction by Coron and May from knowing ϕ(N) to factoring N. This is briefly mentioned in Section 24.1.3.
