Diophantine Analysis Thesis by Jorn Steuding

Download as pdf or txt
Download as pdf or txt
You are on page 1of 79
At a glance
Powered by AI
The document provides an introduction to the theory of Diophantine approximations with a focus on its impact on Diophantine equations. Some of the topics covered include Fermat's last theorem, Pythagorean triples, and the work of mathematicians like Diophantus and Fermat.

Diophantine analysis is the study of polynomial equations with integer or rational solutions. This course provides an introduction to the topic and covers things like Diophantine approximations, Fermat's last theorem, and Pythagorean triples.

Fermat claimed to have a proof that the only integer solutions to the equation X^n + Y^n = Z^n are the trivial ones where XYZ=0, whenever n is greater than or equal to 3. However, he never published a proof and it took centuries for others to prove it, known as Fermat's Last Theorem.

Diophantine Analysis

Jorn Steuding
Winter 2002/03
Johann Wolfgang Goethe-Universit at Frankfurt
This course gives an introduction to the theory of Diophantine Approximations with
a special emphasis on its impact to the eld of Diophantine Equations. It relies
mainly on the nicely written books [9], [29] as well as the more general introduction
[8] and the classics [21], [35]; historical details can be found in [11], [48] and on
www.history.mcs.st andrews.ac.uk. Finally, we refer to [4] for modern perspectives
in this amazing and growing eld.
We will use only standard notation. Special knowledge which is beyond the scope
of rst courses in mathematics is (with some small exceptions) not necessary. The
reader can nd some easy and some dicult exercises in the text which may help to
get familar with the topic.
The author is very grateful to Iqbal Lahseb, Ernesto Girondo, Thomas M uller
and Matthias Volz for several remarks and corrections.
1 Introduction: three basic principles
Diophantus of Alexandria was a greek mathematician who lived around 250 A.D.,
but virtually nothing more about his life is known than that what was written down
on his tombstone:
God granted him to be a boy for the sixth part of his life,
and adding a twelfth part to this, He clothed his cheeks with down.
He lit him the light of wedlock after a seventh part,
and ve years after his marriage He granted him a son.
Alas! late-born wretched child; after attaining
the measure of half his fathers life, chill Fate took him.
After consoling his grief by this science of numbers
for four years he ended his life. (cf. [52], p.55)
How old was Diophantus when he died? This riddle is an example for the kind of
problems in which Diophantus was interested in. He wrote the inuential monogra-
phy Arithmetica which inspired Fermat (1607/08?-1665) to write down
1
It is impossible for a cube to be written as a sum of two cubes
or a fourth power to be written as the sum of two fourth powers or,
in general, for any number which is a power greater than the second
to be written as a sum of two like powers.
I have a truly marvellous demonstration of this proposition
which this margin is too narrow to contain. (cf. [52], p.66)
in his copy of Diophantus book, exactly there where the corresponding question for
squares is considered. In the modern language of algebra Fermat claimed to have a
proof that all solutions of the equation
X
n
+ Y
n
= Z
n
(1)
in integers x, y, z are trivial, i.e. xyz = 0, whenever n 3. Fermat never published
a proof and, by the succcessless quest for a solution of Fermats last theorem over
centuries, mathematicians started to believe that Fermat actually had no proof.
However, no counterexample was found. Only recently Wiles (supported by Taylor
and the earlier works of many others) found a proof for Fermats last theorem (see
[52] for the amazing story of this problem and its nal solution). With our modern
background in mathematics one may ask why the ancient Greek did not consider the
Fermat equation in its generality but only the quadratic case. Their point of view
was inspired by at most three-dimensional geometry and only in the late works of
Greek mathematicians higher powers occur. They also had an advanced knowledge
on divisibility and prime numbers but no idea about the unique prime factorization
of the integers!
It is interesting that the exponent in the Fermat equation is crucial. As Dio-
phantus explained in his book the equation (1) with exponent n = 2 has innitely
many (non-trivial) solutions, for example
3
2
+ 4
2
= 5
2
, 5
2
+ 12
2
= 13
2
, 8
2
+ 15
2
= 17
2
, . . . .
These triples of integral solutions are called Pythagorean triples with regard to
Pythagoras theorem in geometry. However, the history of such triples dates at least
back to the ancient Babylonians four millenia ago. It is conjectured that they used
Pythagorean triples for constructing right angles. Pythagoras (572-492 B.C.) found
not only a proof that such triples yield right angles but gave also an innitude of
primitive (coprime) Pythagorean triples by the identity
(2n + 1)
2
+ (2n
2
+ 2n)
2
= (2n
2
+ 2n + 1)
2
for n N.
In the third century B.C. Euclid solved the problem of nding all solutions.
Exercise 1 Show that all positive integral solutions x, y, z of
X
2
+ Y
2
= Z
2
,
2
satisfying (x, y) = 1 and 2[x, are given by
x = 2ab, y = a
2
b
2
, z = a
2
+ b
2
,
where a and b are coprime integers of opposite parity and a > b. [Hint: each solution
satises x
2
= z
2
y
2
= (z + y)(z y).]
Note that this classication of the Pythagorean triples can be used to prove that
the biquadratic case of (1) has only trivial solutions. Actually, Fermat combined
this result with his method of innite descent to prove that the slightly more general
equation
X
4
+ Y
4
= Z
2
has no positive integral solutions (see [21], 13.3).
We are living in a nite universe (there are only about 10
80
atoms in our uni-
verse) which means that our world can be described using only rational numbers.
However, for an understanding of the physical world as well as the abstract world
of mathematics this is not enough! The same equation which Euclid studied so suc-
cessfully led some hundred years before to one of the great breakthroughs in ancient
Greek mathematics. Hippasus, a pupil of Pythagoras, discovered that the set of
rational numbers is too small for the simple geometry of triangles and squares. In
fact, he proved that the length of the diagonal of a unit square is irrational:

2 =

1
2
+ 1
2
, Q.
Nowadays this is taught at school, but for the Pythagoras school it was the death
of their philosophy that all natural phenomenon could be explained in integers. It
is said that Pythagoras sentenced Hippasus to death by drowning. This unkind act
could not stop the mathematical progress. In order to solve polynomial equations
and for the merits in analysis the mathematicians invented various new numbers:
N Z Q R C,
and even more as we will see in section 16; we refer the interested reader to the
collection [12] of excellently written surveys on numbers of all kinds. However, for
the rst we shall only consider real numbers.
By Cantors diagonal argument we know that the set of real numbers R is un-
countable whereas Q is countable. Therefore, almost all real numbers are irrational.
We shall give a more advanced example than the square root of two. One of the fun-
damental constants in analysis (and natural sciences) is the value of the exponential
function at one:
e =

n=0
1
n!
.
Theorem 1 e is irrational.
3
This result dates back to Euler in 1737, resp. Lambert in 1760; however their
approach via continued fractions is rather dicult (we will return to this later).
However, the following simple proof would have been possible in Eulers times.
Proof. Suppose the contrary, then there exist positive integers a, b such that e =
a
b
.
If m is an integer b then b divides m! and
= m!
_
e
m

n=0
1
n!
_
= a
m!
b

m

n=0
m!
n!
is an integer. On the other side,
0 < =

n=m+1
m!
n!
<
1
m + 1

k=0
_
1
m + 1
_
k
=
1
m
,
which gives the contradiction. The theorem is proved.
This proof reveals an important principle in the diophantine toolbox: the series
converges so fast that the limit cannot be of a restricted arithmetical nature!
The question whether a given real number is irrational seems to be simple on the
rst view. Actually, this is a rather dicult problem. For instance, it is unknown
whether the Euler-Mascheroni constant
= lim
N
_
N

n=1
1
n
log N
_
is irrational. Another important constant is , which we dene as the smallest
positive root of the sine function.
Theorem 2 is irrational.
The rst proof of the irrationality of was given by Lambert in 1761, also by
using continued fractions. The proof which we shall give now, as all other known
proofs, is slightly more dicult than the given one for e. This holds also in other
questions around these two fundamental numbers. It seems that e has somehow
more structure than . Our short but tricky proof is due to Niven [37].
Proof. We start with some preliminaries. For n N dene
f
n
(x) =
1
n!
x
n
(1 x)
n
. (2)
Hence,
f
n
(x) =
1
n!
2n

j=n
c
j
x
j
with c
j
Z
and
0 < f
n
(x) <
1
n!
for 0 < x < 1. (3)
4
It is easy to nd for the k-th derivative
(1)
k
f
(k)
n
(1) = f
(k)
n
(0) =
_
0 if 0 k < n,
k!
n!
c
k
if n k 2n;
(4)
here we used f
(k)
n
(x) = (1)
k
f
(k)
n
(1 x) and the Taylor expansion (or dumb com-
putation). Note that all these values are integers.
We shall even show that
2
is irrational. Assume that
2
=
a
b
with positive
integers a and b. We consider the polynomial
F
n
(x) := b
n
(
2n
f
n
(x)
2n2
f
(2)
n
(x) . . . + (1)
n
f
(2n)
n
(x)).
Since b
n

2k
= b
nk
a
k
Z for 0 k n, it follows in view of (4) that F
n
(0), F
n
(1)
Z. A short calculation shows
(F

n
(x) sin(x) F
n
(x) cos(x))

=
2
a
n
f
n
(x) sin(x).
With regard to sin = sin 0 = 0 this yields
J
n
:= a
n
_
1
0
f
n
(x) sin(x) dx = F
n
(0) + F
n
(1),
which is an integer. On the other side we get with regard to (3)
0 < J
n
<
a
n
n!
.
Since the factorial n! grows faster than a
n
, the right hand side above can be made
< 1 by choosing n suciently large. This gives the contradiction and the theorem
is proved.
Also this proof is very interesting: a problem concerning the arithmetic nature of
a given real number is solved by the construction of an appropriate sequence of
polynomials with respect to its analytical behaviour!
As we have seen above almost all real numbers are irrational, but how to deal
with them in our nite world? Since Q is dense in R it is natural to search for
rational approximations (but Q has also the disadvantage of beeing very thin in
R). For example, we can think about some machine which has to compute an
approximation to the circumference of a circle with given radius. For that we have
to construct some mechanism with gears which approximates the ratio : 1, but
this needs a good approximation of the irrational
= 3.14159 26535 . . .
by rational numbers. It is interesting to see how this problem was solved in former
times:
The Rhind Papyrus ( 1650 B.C.): 4
_
8
9
_
2
= 3.16045 . . . ;
5
Old testament ( 1000 B.C.): 3;
Archimedes (287-212 B.C.):
22
7
= 3.14285 . . . ;
Tsu Chung Chi ( 500 A.D.):
355
113
= 3.14159 29920 . . . .
We shall not speak about the attempt of an american politician to attach by law
the value 3.2 in 1897; for this and more on see [5]. More useful is the rhyme
Now I want a drink, alcoholic of course,
after the heavy lectures involving quantum mechanics!
Recently, Kanada and Takahashi computed up to more than 206 billion digits.
This precision is beyond any use in applications (the Planck constant 10
33
is the
smallest unit in quantum mechanics) but interesting from a mathematical point
of view. Since Q is dense in R, for any real number there exist inntely many
rational approximations. Moreover, we can approximate with any assigned degree
of accuracy. But what are good and what are bad approximation among them or,
what is the same, how rapidly can we approximate? A natural measure is the
denominator. Let be any real number, then we say that
p
q
Q with q 1 is a
best approximation to if
[q p[ < [Q P[ for all Q < q,
where P, Q Z; necessarily a best approximation is a reduced fraction. In particu-
lar, a best approximation
p
q
to is the nearest rational number with denominator
q , but not vice versa! For example, the rst best approximations to are
3
1
,
22
7
,
333
106
,
355
113
,
1 03993
33102
, . . . .
It is remarkable that the fractions given by Archimedes and of Tsu Chung Chi are
best approximations. How could they do that in these dark ages without computers?
In general, nding good rational approximations to a given irrational number is a
dicult task.
In honour of Diophantus we speak about
Diophantine Approximations when we search for rational approximations
to rational or irrational numbers;
Diophantine Equations when we investigate polynomial equations on solu-
tions in integers (or rationals).
As we shall see both areas are linked in various directions. For instance, we ask for
solutions of the equation
38X 15Y = 1 (5)
6
in integers. One approach to answer this question oers Euclids algorithm. By
division with remainder for any positive integers a, b with b > a there exist
integer q, r such that
b = aq + r with 0 r < a.
Nowdene r
1
:= b, r
0
:= a. Then, successive application of division with remainder
yields the Euclidean algorithm:
For j = 0, 1, . . . do r
j1
= q
j+1
r
j
+ r
j+1
with 0 r
j+1
< r
j
. (6)
Since the sequence of remainders is strictly decreasing, the algorithm nishes after
nitely many, say n steps and, by simplest divisibility properties, it turns out that
the last non-vanishing remainder r
n
is equal to the greatest common divisor of a
and b.
Exercise 2 Show that the running time (i.e. the number of steps) in the Euclidean
algorithm is
n
2
log 2
log(b + 1).
Can you describe the smallest numbers a, b for which the algorithm needs exactly n
steps? Can you improve the previous estimate?
From an algorithmical point of view it is of advantage to use a modied Euclid-
ean algorithm which works with the nearest integer and not with the next larger
integer. However, for our purpose the above stated version is sucient. Reading
the Euclidean algorithm backwards, we get a concrete integral solution of the linear
equation
bX aY = (a, b), (7)
say x
0
, y
0
. It is easily seen that then all integral solutions to the latter diophantine
equation are given by
_
x
y
_
=
_
x
0
y
0
_
+
1
(a, b)
_
a
b
_
Z. (8)
From here it is only a small step to
Theorem 3 The linear diophantine equation bX aY = c with integers a, b, c is
solvable if and only if (a, b)[c.
We return to our example. It may be a little bit surprising to see that the
solutions of (5) yield good approximations to
15
38
and vive versa:
. . . , 38 (13) 15 (33) = 1 , 38 2 15 5 = 1 , 38 17 15 43 = 1, . . .
7
vs.
x
y
=
2
5
,
13
33
,
17
43
. . .
15
38
;
note that the solutions x, y with [y[ < 38 give even the best approximations
x
y
to
15
38
. However, this is not too surprising. We may rewrite each solution of our
modied equation as
x
y
=
15
38
+
1
38y
,
and since the second term is rather small, we see a certain relation between approx-
imations to
15
38
and our integral solutions.
Exercise 3 Prove that the best approximations
x
y
to any given
a
b
Q exactly cor-
respond to those solutions x, y Z of (7) with [y[ < b.
With regard to our previous observations we can solve the linear diophantine equa-
tion (7) by searching for the best approximations to
a
b
. This was rst discovered by
Aryabhata around 550 A.D. However, this example is only the top of an iceberg:
certain diophantine equations can be investigated by studying a related problem in
the theory of diophantine approximations!
2 Dirichlets approximation theorem
It is much more interesting to approximate irrational numbers than rationals. In
1842 Dirichlet proved the rst but fundamental approximation theorem.
Theorem 4 If is irrational, then there exist innitely many rational numbers
p
q
such that


p
q

<
1
q
2
; (9)
this property characterizes irrational numbers, i.e., a rational allows only nitely
many approximations
p
q
with (9).
In particular, there exist innitely many best approximations to any given irrational
number (in contrast to rationals; see Exercise 3).
The following original proof relies on the pigeonhole principle which states
that if n + 1 objects are distributed to n boxes, then at least one box contains at
least two objects (which is easily seen by a contradiction argument or induction).
Proof. We dene the integral part and the fractional part of a real number x
by
[x] = maxz Z : z x and x = x [x],
respectively. Let Q be a positive integer. The numbers
0, , 2, . . . , Q
8
dene Q+ 1 points distributed among the Q disjoint intervals
j 1
Q
x <
j
Q
for j = 1, . . . Q.
By the pigeonhole principle there has to be at least one interval which contains at
least two numbers k > , say, with 0 k, Q. It follows that
0 k = k [k] + []
= (k ) + [(k )] + [] [k]
. .
Z
<
1
Q
.
Obviously, the integral parts add up to zero. Hence, setting q = k we obtain
q = k <
1
Q
.
With p := [q] it follows that


p
q

=
[q p[
q
<

qQ
<
1
qQ
, (10)
which implies the estimate (9) (since q < Q).
Now assume that is rational, say =
a
b
. If ,=
p
q
, then


p
q

=
[aq bp[
bq

1
bq
(11)
and (9) involves q < b, which proves that there are only nitely many
p
q
with (9).
Now suppose that is irrational and that there exist only nitely many solutions
p
1
q
1
, . . . ,
pn
qn
to (9). Since , Q, we can nd a Q such that


p
j
q
j

>
1
Q
for j = 1, . . . , n,
contradicting (10). The theorem is proved.
Notice that by this proof it is impossible to compute approximations without big
computational eort.
Dirichlets approximation theorem can be extended to a rst irrationality cri-
terium:
Exercise 4 Prove that if there are innitely many coprime solutions p, q of
[q p[ < q

with a xed > 0, then is irrational. [Hint: using (11) show that for rational the sets
of possible denominators q and of possible numerators p are nite.]
Deduce the irrationality of

n=1
2
n!
.
[The idea behind should be compared with our proof of the irrationality of e.]
9
More irrationality criteria of this type can be found in [8], 5.
Dirichlets proof yields also an extension to simultaneous approximation prob-
lems.
Exercise 5 Prove that if
1
, . . . ,
n
are arbitray real numbers, then the system of
inequalities

j

p
j
q

< q
1
1
n
for j = 1, . . . , n
has at least one solution p
1
, . . . p
n
, q Z; if at least one
j
is irrational, then it has
an innitude of solutions.
We say that is approximable by rationals of order if there exists a
positive constant c(), depending only on , such that


p
q

<
c()
q

has an innity of solutions


p
q
. In a sense, indicates the speed of convergence of
the sequence
p
q
to . With view to our previous observations we see that
any rational is approximable of order 1, and to no higher order (by Exercises
3 and 5);
any irrational is at least approximable of order 2 (by Theorem 4).
It is a natural question to ask for improvements of Dirichlets theorem. Khintchine
proved in the 1930s a remarkable result which states that the set of numbers which
may allow a stronger approximation has measure null, i.e given any positive ,
the set can be covered by a countable number of intervals of total length < .
Theorem 5 Suppose that is a positive function such that

q=1
(q)
converges. Then for almost all (i.e. with exceptions which have measure 0), there
is only a nite number of solutions p, q Z to the inequality
[q p[ < (q). (12)
The proof relies on Dirichlets idea of locating the real number in question in
small intervals.
Proof. Given > 0, we can nd an integer Q such that

qQ
(q) <

2
.
10
Now consider those for which the inequality (12) has innitely many solutions.
Now for each q Q consider the intervals of radius
(q)
q
surrounding the rational
numbers
0
q
,
1
q
, . . . ,
q1
q
. Consequently, each will lie in one of these intervals. The
measure of these intervals is

qQ
q
2(q)
q
< ,
which proves the theorem.
For example, we may take (q) = q
1
(log q)
1
. Thus almost all numbers cannot
be approximated by an order 2+. A more subtile characterization of real numbers
with respect to the order of approximation seems to be a rather dicult task. We
will return to this problem in a later section.
We shall briey give an interesting application of Dirichlets approximation the-
orem to dense sequences. As we have seen above for any given real we can nd
integers q for which q diers from an integer by as little as we please. Kroneckers
celebrated approximation theorem from 1891 considers the inhomogeneous case.
Theorem 6 If is irrational, R is arbitrary, then for any N N there exist
Q N with Q > N and P Z such that
[Q P [ <
3
Q
.
Proof. By Theorem 4 there are integers q > 2N and p such that
[q p[ <
1
q
.
Suppose that m is the integer, or one of the two integers, for which
[m q[
1
2
.
In view of Theorem 3 and in addition (7) we can write m = px qy, where x and
y are integers and [x[
1
2
q . Since
q(x y ) = x(q p) (q m),
we nd
[q(x y )[ <
1
2
q
1
q
+
1
2
= 1.
Setting Q = q + x and P = p + y yields
N <
1
2
q Q
3
2
q,
and thus
[Q P [ [x y [ +[q p[ <
2
q

3
N
.
11
This proves the theorem.
In particular, Kroneckers approximation theorem has interesting consequences on
dense sequences:
Exercise 6 Show that the sequence (n) lies dense in [0, 1] if and only if is
irrational.
Much more can be said about such sequences. In fact, one can show that if and
only if is irrational, the sequence (n) is uniformly distributed modulo 1,
i.e., the correct proportion of points n lies in an arbitrary subinterval (a, b) of
(0, 1). On the other side, it is not dicult to show that for example the fractional
parts of log n prefer small values where, as usual in number theory, log n is the
logarithm to base e. But it is yet unproved that the sequence exp(n) is uniformly
distributed modulo 1. For a rigorous denition of uniform distribution, the current
knowledge in this eld and its various applications we refer the reader to [23].
Another interesting application of Kroneckers approximation theorem is the
problem of the reected light ray in plane geometry due to K onig and Sz ucs. The
sides of a square are reecting mirrors. A ray of light leaves a point inside the square
and is reected in the mirrors. One can show that its path is closed and periodic
if and only if the angle between a side of the square and the initial direction of the
ray has a rational tangent; otherwise the ray of light passes arbitrarily near to every
point of the square. For the proof of this entertaining billiards see [21], 23.3.
In the following sections we will meet two dierent approaches for more detailed
studies concerning diophantine approximations.
3 The Farey sequence
Farey fractions were introduced by Haros in 1802 and (independently) by Farey in
1816. However, Cauchy was the rst who studied them systematically.
For any positive integer n the Farey sequence T
n
of order n is the ordered
list of all reduced fractions in the unit interval having denominators n:
T
n
=
_
a
b
Q : 0 a b n with (a, b) = 1
_
.
For example,
T
1
=
_
0
1
,
1
1
_
T
2
=
_
0
1
,
1
2
,
1
1
_
T
3
=
_
0
1
,
1
3
,
1
2
,
2
3
,
1
1
_
. . . . (13)
Clearly, T
n
T
n+1
. Further, each rational in the unit interval occurs sooner or
later in the Farey sequence. Hence, the Farey sequence is building Q modulo Z.
In particular, this construction proves that Q is countable. We can be a bit more
precise to get a feeling for the size of T
n
. It is easily seen that the number of Farey
fractions in T
n
is related to Eulers totient (b) which counts the number of positive
12
coprime integers a b. With some knowledge on arithmetical functions one can
show the asymptotic formula
T
n
= 1 +

bn
(b) =
3

2
n
2
+ O(nlog n),
where f(x) = O(g(x)) with a positive function g(x) denotes that
limsup
x
[f(x)[
g(x)
is bounded.
Two consecutive elements in T
n
are called neighbours. One fundamental prop-
erty of Farey fractions is the spacing of neighbours in the unit interval.
Theorem 7 For any neighbours
a
b
<
c
d
in T
n
,
bc ad = 1.
In particular, under the assumptions of the theorem
c
d

a
b
=
bc ad
bd
=
1
bd
.
This shows that T
n
is not uniformly, but in a diophantine sense, optimally distrib-
uted, namely with regard to the denominators.
Proof. Consider the diophantine equation
bX aY = 1.
Since a and b are coprime, by Theorem 3 and in addition (8) there exists a solution
x, y Z with nb < y n. Consequently, also x and y are coprime, and therefore
x
y
T
n
. It is easy to compute that
x
y
=
a
b
+
1
by
>
a
b
. (14)
Now suppose that
c
d
<
x
y
, then
x
y

c
d
=
dx cy
dy

1
dy
.
Further,
c
d

a
b
=
bc ad
bd

1
bd
.
These estimates imply
x
y

a
b
=
x
y

c
d
+
c
d
. .
=0

a
b

1
bd
+
1
dy
=
y + b
bdy
>
n
bdy
.
13
In view of (14) it follows that n < d, contradicting
c
d
T
n
. Thus we have
c
d
=
x
y
which proves the theorem.
Note that the proof gives a rule for the computation of the successor of a Farey
fraction
a
b
in T
n
. This term is also related to the former right neighbour of
a
b
. We
dene the mediant of
a
b
,
c
d
T
n
by
a+c
b+d
(like adding two fractions the wrong way).
If
a
b
<
c
d
, then it is easily seen that
a
b
<
a + c
b + d
<
c
d
.
The mediant is a mean value, but dierent to other more common mean values
like the arithmetic or the geometric mean value, the mediant is not monotonic: for
example,
1
6
<
1
3
and
23
36
<
2
3
, but
1 + 23
6 + 36
=
4
7
>
1
2
=
1 + 2
3 + 3
.
This phenomenon is well-known in statistics where it is called Simpsons paradox
(though it is not paradox).
Exercise 7 Show that for any given pair
A
B
<
C
D
there exist innitely many pairs
a
b
<
A
B
,
c
d
<
C
D
such that
a + c
b + d
>
A+ B
C + D
.
For this and more see [53].
The importance of the notion of mediants becomes clear by investigating (13).
The process of taking mediants builds the Farey sequence out of
0
1
and
1
1
.
Theorem 8 The fractions which belong to T
n
but not to T
n1
are mediants of
elements of T
n
.
In particular, the property for two consecutive Farey fractions
a
b
,
c
d
of being neigh-
bours gets lost in T
n
if n b + c.
Proof. With regard to Theorem 7 we have for consecutive Farey fractions
a
b
<
x
y
<
c
d
in T
n
the identities
bx ay = 1 and cy dx = 1.
Multiplying this with c and a, resp. d and b, yields
x(bc ad) = a + c and y(bc ad) = b + d,
14
which implies the second assertion.
The Farey sequence is related to an amazing geometry, discovered by Ford in the
1940s. We may embed the unit interval [0, 1] into the complex plane C and dene
for each
a
b
T
n
the so-called Ford circle
(
_
a
b
_
=
_
z C :

z
_
a
b
+
i
2b
2
_

=
1
2b
2
_
,
where as usual i :=

1. Ford circles provide an interesting sphere packing:


Exercise 8 Prove that two distinct Ford circles i) have an empty intersection of the
interiors, and ii) are tangent if and only if they are associated to neighbours in the
Farey sequence.
We return to our problem to nd explicitly good approximations to a given in the
unit interval. If we want to get a rational approximation with denominator n,
then we only have to look for the Farey fraction
a
b
T
n
with minimal distance to
. Further approximations can be found by taking the mediants. It is obvious how
one can use Ford circles to localize visually these Farey fractions.
Exercise 9 The golden ratio is given by g =
1
2
(

51). Find all best approxima-


tions
p
q
to g with a denominator q 100. Do you see some hidden law? [Compare
your observations with the numbers for which the Euclidean algorithm is extremal; see Exercise 2.]
Compute the quantities q[qg p[ . Do the same for =

2.
The Farey dissection of the continuum is a very usefool tool in the theory of diophan-
tine approximations. It provides not only an approach to explicit approximations
but also an improvement of Dirichlets approximation theorem, found by Hurwitz
in 1891.
Theorem 9 If is irrational, then there exist innitely many rational numbers
p
q
such that


p
q

<
1

5q
2
;
the constant

5 is best possible.
Proof. Suppose that
a
b
<
c
d
are those neighbours in T
n
for which
a
b
< <
c
d
.
Without loss of generality we assume that >
a+c
b+d
(since we may replace by
1 otherwise). If now

a
b

1

5b
2
,
a + c
b + d

1

5(b + d)
2
and
c
d

1

5d
2
, (15)
15
then we obtain by adding these inequalities
c
d

a
b

1

5
_
1
b
2
+
1
d
2
_
=
1

5
b
2
+ d
2
b
2
d
2
and
c
d

a + c
b + d

1

5
_
1
(b + d)
2
+
1
d
2
_
=
1

5
(b + d)
2
+ d
2
(b + d)
2
d
2
.
On the other side, Theorem 7 and 8 give upper bounds for these quantities, which
lead to the inequalities

5bd b
2
+ d
2
and

5(b + d)d (b + d)
2
+ d
2
.
Adding both inequalities gives

5(d
2
+ 2bd) 3d
2
+ 2bd + 2b
2
,
resp.
0 4b
2
4(

5 1)bd + (5 2

5 + 1)d
2
= (2b (

5 1)d)
2
,
which is impossible since b, d Z but

5 , Q. Thus, with view to (15) at least


one of the inequalities


a
b

<
1

5b
2
,


a + c
b + d

<
1

5(b + d)
2
and


c
d

<
1

5d
2
holds. Since is irrational this argument yields the existence of innitely many
such rational approximations by sending n . The rst assertion of the theorem
is proved.
For the second one recall the denition of the golden ratio g =
1
2
(

5 1). We
consider the polynomial P(X) := X
2
+ X 1. With G :=
1
2
(

5 + 1) we have
P(X) = (X g)(X + G). Now assume that

g
p
q

<
1
Cq
2
with p Z, q N. Then

P
_
p
q
_

g
p
q

G+
p
q

g
p
q

G+g g
. .
=0

p
q

<

5
Cq
2
+
1
C
2
q
4
.
On the other side,

P
_
p
q
_

=
[p
2
+ pq q
2
[
q
2

1
q
2
,
which implies C

5. This proves the second assertion.


16
By the proof it seems that the restriction on C depends mainly on the fact that we
investigated approximations to g. We call a real number quadratic irrational
if it is the root of an irreducible quadratic polynomial with integral coecients, the
minimal polynomial of . Obviously, any such is of the form
=
a + b

d
c
with a, b Z, c, d N, (16)
where d is squarefree, i.e., all prime divisors of d have multiplicity one; in partic-
ular, a quadratic irrational is irrational.
Exercise 10 Suppose that is the real quadratic irrational with minimal polyno-
mial aX
2
+ bX + c. Show that for C >

b
2
4ac the inequality


p
q

<
1
Cq
2
has only nitely many solutions p, q Z. Compare this with Exercise 9. Try to formulate
a conjecture for quadratic irrationals.
Another interesting topic in context of the Farey sequence is the Riemann
hypothesis which claims that the prime numbers are as uniformly distributed as
possible, namely
(x) =
_
x
2
du
log u
+ O
_
x
1
2
+
_
for any > 0, (17)
where (x) counts the number of primes x. This is problem number 8 of Hilberts
famous list of 23 problems which he presented at the International Congress of
Mathematicians in Paris in 1900. The Riemann hypothesis is yet unsolved (actually
it is one of the seven Millenium problems). Surprisingly, is related to the deviation
of the Farey sequence from being uniformly distributed. Denote the j -th element of
T
n
by f
j
(ordered by their size). Then, by a result of Franel, Riemanns hypothesis
is true if and only if
Fn

j=1

f
j

j 1
T
n

= O
_
n
1
2
+
_
(see [24] for the details).
Further applications of Farey fractions are error-free computing strategies (details
can be found in [46], 5.12) and the dissection of the complex unit circle with regard
to the circle method (see [24]).
4 Continued fractions
The powerful tool of continued fractions was rst systematically studied by the
dutch astronomer Huygens in the 17-th century, motivated by technical problems
17
while constructing a model of our solar system. The most detailed textbook on this
topic is the classic [38].
Recall the intimate relation between the approximations to a given rational
a
b
and the solutions of linear diophantine equations (7), obtained by the Euclidean
algorithm. We may rewrite the Euclidean algorithm (6) by
r
j1
r
j
=
_
r
j1
r
j
_
+
r
j+1
r
j
with 0 r
j+1
< r
j
. for j = 0, 1, . . . , n (18)
until r
n
,= 0. Setting a
j
=
_
r
j1
r
j
_
, we obtain
r
1
r
0
= a
0
+
_
r
0
r
1
_
1
= a
0
+
1
a
1
+
_
r
1
r
2
_
1
= . . . .
The rst of these identities yields the integral part as an approximation to the
rational
r
1
r
0
. Using more and more of these identities, we obtain better and better
approximations, and among them we can even nd all best approximations. We
shall give an example: a solar year has
365 days 5 hours 48 minutes and 45, 8 seconds 365 +
419
1730
days.
Unfortunately, this is not an integer, so how to create a good calendar? With view
to (18) we nd
1730
419
= 4 +
54
419
,
resp.
365 +
419
1730
= 365 +
_
1730
419
_
1
365 +
1
4
,
which is nothing else than Caesars calendar, i.e., a leap year each fourth year. By
the full Euclidean algorithm we get
365 +
419
1730
= 365 +
1
4 +
1
7 +
1
1 +
1
3 +
1
6 +
1
2
.
Using this without the last fraction
1
2
gives
365 +
419
1730
365 +
194
801
;
this our present calendar (six leap years are deleted in 800 years), the Gregorian
calender, introduced by pope Gregor XIII in 1582.
18
Exercise 11 The lunar month has 29.53 days; approximate the solar year by
365.24. Recover the Metonic cycle of seven leap month every 19 years from a good
approximation to
365.24
29.53
by use of the Euclidean algorithm. This is used in the jewish
calendar.
We call
a
0
+
1
a
1
+
1
a
2
+
...
+
1
a
n1
+
1
a
N
,
where a
0
Z and a
n
N for 1 n < N and a
N
1, a nite simple continued
fraction. Here means simple that all numerators are equal to one; it is possible to
allow other values but in what follows we will, more or less, deal only with simple
continued fractions; therefore we may omit the word simple in the sequel. The a
n
are called partial denominators. For brevity we denote the continued fraction
above [a
0
, a
1
, a
2
, . . . , a
N
] .
First, we shall consider [a
0
, . . . , a
N
] as a function in the variables a
0
, . . . a
N
. We
nd by induction on n
[a
0
, a
1
, . . . , a
n
] =
_
a
0
, a
1
, . . . , a
n1
+
1
a
n
_
(19)
= a
0
+
1
[a
1
, . . . , a
n
]
= [a
0
, [a
1
, . . . , a
n
]].
For n N we call [a
0
, a
1
, . . . , a
n
] the n-th convergent to [a
0
, a
1
, . . . , a
N
] and
dene
p
1
= 1 , p
0
= a
0
and p
n
= a
n
p
n1
+ p
n2
, (20)
q
1
= 0 , q
0
= 1 and q
n
= a
n
q
n1
+ q
n2
.
The computation of the convergents is easily ruled by means of
Theorem 10 The functions p
n
, q
n
satisfy
(i)
pn
qn
= [a
0
, a
1
, . . . , a
n
] ,
(ii) p
n
q
n1
p
n1
q
n
= (1)
n
,
(iii) p
n
q
n2
p
n2
q
n
= (1)
n
a
n
.
Proof We prove the rst assertion by induction on n. The cases n = 0 is obvious.
The case n = 1 is easily computed by
[a
0
, a
1
] =
a
1
a
0
+ 1
a
1
=
p
1
q
1
.
19
Now assume that the formula in question holds for n. Then, in view of (19) and
the recursion formulae for the p
n
, q
n
[a
0
, a
1
, . . . , a
n
, a
n+1
] =
_
a
0
, a
1
, . . . , a
n
+
1
a
n+1
_
=
_
a
n
+
1
a
n+1
_
p
n1
+ p
n2
_
a
n
+
1
a
n+1
_
q
n1
+ q
n2
=
a
n+1
p
n
+ p
n1
a
n+1
q
n
+ q
n1
=
p
n+1
q
n+1
,
which proves (i). In view of this we have
p
n
q
n1
p
n1
q
n
= (a
n
p
n1
+ p
n2
)q
n1
p
n1
(a
n
q
n1
+ q
n2
)
= (p
n1
q
n2
p
n2
q
n1
).
Repeating this argument with n 1, n 2, . . . , 2 proves (ii). Consequently, p
n
and
q
n
are coprime. Further,
p
n
q
n2
p
n2
q
n
= (a
n
p
n1
+ p
n2
)q
n2
p
n2
(a
n
q
n1
+ q
n2
)
= a
n
(p
n1
q
n2
p
n2
q
n1
),
and the last assertion follows from the second one.
The continuous fraction expansion is not unique since
[a
0
, a
1
, a
2
, . . . , a
N
] = [a
0
, a
1
, a
2
, . . . , a
N
1, 1],
but nearly unique. With regard to our previous observations it is easily seen
that any rational number has a representation as a nite continued fraction
[a
0
, a
1
, a
2
, . . . , a
N
] , which is unique if 1 < a
N
N. By Theorem 10 we get a
better understanding between the solutions of linear diophantine equations (7) and
best approximations:
Exercise 12 Calculate the continued fraction for
15
38
and compare its convergents
with the approximations found in Section 1. Give a further proof of Exercise 3 by
use of Theorem 10.
We can rewrite the algorithm for computing the continued fraction of a rational
=:
0
by the iteration

n
= [
n
] +
1

n+1
for n = 0, 1, . . . ,
and setting a
n
= [
n
] . Obviously, if is rational the iteration stops after nitely
many steps. Otherwise, if is irrational, then the iteration does not stop and we
get by this procedure
= [a
0
, a
1
, a
2
, . . .].
20
[a
0
, a
1
, a
2
, . . .] is an innite continued fraction but the rst thing we have to ask
is whether the underlying innite process is convergent? Theorem 10 remains valid
and with regard to its second assertion

p
n
q
n
=

n+1
p
n
+ p
n1

n+1
q
n
+ q
n1

p
n
q
n
=
(1)
n
q
n
(
n+1
q
n
+ q
n1
)
. (21)
Since the q
n
are strictly increasing for n 2, the sequence of convergents is alter-
nating
p
0
q
0
<
p
2
q
2
< . . . < < . . . <
p
3
q
3
<
p
1
q
1
.
Thus, we have
Theorem 11 If is irrational, then


p
n
q
n

<
1
q
n
q
n+1
;
in particular,
= lim
n
p
n
q
n
= [a
0
, a
1
, a
2
, . . .].
It is easily shown that the continued fraction expansion of any irrational number is
uniquely determined. Hence, conversely, we can introduce the set of real numbers
R via continued fractions!
With view to Theorem 11 it becomes visible what an important role continued
fractions play in the theory of diophantine approximations. Firstly, it gives a third
proof of Dirichlets approximation theorem. In view of (21) we have even
Exercise 13 Prove that among two consecutive convergents to any irrational
there is at least one satisfying


p
q

<
1
2q
2
.
Furthermore, if
p
q
is a solution of the latter inequality, then
p
q
is a convergent to .
One can go further. Actually, among three consecutive convergents to any irrational
there is at least one satisfying the inequality in Hurwitz theorem. Thus, we
obtain explicitly as many good approximations to a given irrational as we please.
For instance, a short computation gives the sequence of convergents to
[3] = 3, [3, 7] =
22
7
, [3, 7, 15] =
333
106
,
[3, 7, 15, 1] =
355
113
, [3, 7, 15, 1, 292] =
1 03993
33102
, . . . ,
which are located as follows
3 <
333
106
<
1 03993
33102
< . . . < < . . . <
355
113
<
22
7
;
we observe that the sequence of convergents covers the best approximations! This
is not a miracle as Lagrange proved in 1770:
21
Theorem 12 Let be any irrational number with convergents
pn
qn
. If n 2 and
p, q are positive integers satisfying 0 < q q
n
and
p
q
,=
pn
qn
, then
[q
n
p
n
[ < [q p[.
This so-called law of best approximations shows that we cannot do better than
approximating an irrational number by its convergents!
Proof. We may suppose that p and q are coprime. Since
[q
n
p
n
[ < [q
n1
p
n1
[,
it is sucient to prove the assertion under the assumption that q
n1
< q q ; the
full statement of the theorem follows then by induction on n. If q = q
n
, then p ,= p
n
and it follows that

p
q

p
n
q
n

1
q
n
.
But


p
n
q
n

1
q
n
q
n+1
<
1
2q
n
by Theorem 11. Thus


p
n
q
n

<


p
q

,
which implies the inequality in question after multiplication with q = q
n
. Now
suppose that q
n1
< q < q
n
. The system of linear equations
p
n
X + p
n1
Y = p and q
n
X + q
n1
Y = q
has the unique solution
x =
pq
n1
qp
n1
p
n
q
n1
p
n1
q
n
= (pq
n1
qp
n1
) , y =
pq
n
qp
n
p
n
q
n1
p
n1
q
n
= (pq
n
qp
n
).
Hence, x and y are integers, neither is zero. Obviously, x and y have opposite
sign. Since q
n
p
n
and q
n1
p
n1
have opposite sign as well, x(q
n
p
n
) and
y(q
n1
p
n1
) have the same sign. Since
q p = x(q
n
p
n
) + y(q
n1
p
n1
),
we obtain
[q p[ > [q
n1
p
n1
[ > [q
n
p
n
[,
which was to show.
22
5 The irrationality of (3)
The Riemann zeta-function is dened by
(s) =

n=1
1
n
s
=

p
_
1
1
p
s
_
1
for s > 1, where the so-called Euler product is taken over all prime numbers.
The identity between the series and the product may be regarded as an analytic
version of the unique prime factorization in Z. This analytic approach was found
by Euler in 1737 and gives a rst glance on the intimate relation between (s)
and the distribution of prime numbers. For instance, assume that there are only
nitely many primes, then the product converges throughout the complex plane, in
contradiction to the divergence of the harmonic series (i.e, the singularity of (s) at
s = 1). Thus, there are innitely many prime numbers. However, for more deeper
studies one has to consider (s) as a function of a complex variable. (see [24] for
more details on this amazing link between analysis and number theory).
The values of the zeta-function taken at the integers are of special interest in
number theory. Euler showed that
(2n) = (1)
n1
(2)
2n
2(2n)!
B
2n
for n N,
where B
k
is the k-th Bernoulli number dened by
z
exp(z) 1
=

k=0
z
k
k!
B
k
;
note that B
k
is rational. However, nearly nothing is known about the values taken
at the odd integers (this is related to the trivial zeros of (s) at s = 2n, n N).
It was a senstaion when Apery proved in 1978
Theorem 13 (3) is irrational.
We can only give a sketch of the proof; for the exciting story behind and more details
on the proof see [39]. The rst step is the recursion formula
n
3
u
n
+ (n 1)
3
u
n2
= (34n
3
51n
2
+ 27n 5)u
n1
for n 2.
We dene sequences a
n
and b
n
generated by this recursion and the initial values
a
0
= 0 , a
1
= 6 and b
0
= 1 , b
1
= 5.
It turns out that b
n
Z and a
n
Q with denominators dividing 2[1, 2, . . . , n]
3
,
where [1, 2, . . . , n] denotes here the least common multiple of 1, 2, . . . , n. Multiply-
ing this recursion formula with u = a by b
n1
and additionally the same with u = b
by a
n1
, we get
n
3
(a
n
b
n1
a
n1
b
n
) = (n 1)
3
(a
n1
b
n2
a
n2
b
n1
).
23
Since a
1
b
0
a
0
b
1
= 6 we obtain
a
n
b
n1
a
n1
b
n
=
6
n
3
. (22)
This leads to an example of a non-simple continued fraction
(3) =
6
5
1
6
117
...

n
6
34n
3
+ 51n
2
+ 27n + 5 . . .
;
this diers from the expansions which we investigated before but it is convergent
too. One can deduce from (22)

(3)
a
n
b
n

m=n+1
6
m
3
b
m
b
m1
= O
_
1
b
2
n
_
.
It is easy to compute by the recursion formula that
b
n

n
, where = (1 +

2)
4
.
By the prime number theorem, which is a weaker form of (17), it can be shown that
[1, 2, . . . , n] =

pn
p
[
log n
log p
]

pn
n = n
(n)
e
n
.
Now setting p
n
= 2a
n
[1, 2, . . . , n]
3
and q
n
= 2b
n
[1, 2, . . . , n]
3
(which are both inte-
gers by our remark above), we obtain q
n

n
e
3n
and
[q
n
(3) p
n
[ = O
_
q

n
_
with =
log 3
log + 3
= 0.08052 . . . .
This proves the theorem in view of Exercise 5.
Aperys discovery started intensive research on this topic with the aim to obtain
similar results for other values of (s). Beukers found a dierent proof for (3) , Q
by using Legendre polynomials, i.e., for n N the n-th derivative of the polynomial
dened in (2). However, it is still open whether (5) or any other value taken at
positive odd integers is irrational. Recently, Rioval proved that there are innitely
many irrational numbers in the set of odd zeta-values and Zudilin showed that at
least one of the four numbers (5), (7), (9) and (11) is irrational (cf. [61]).
6 Quadratic irrationals
Leonrado da Pisa (1170-1250) wrote the inuential book Liber abaci in which he
invented the arabic ciphers and zero to Europe. Furthermore, he introduced the
Fibonacci numbers
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, . . . ,
24
dened by the recursion
F
0
:= F
1
:= 1 and F
n+1
= F
n
+ F
n1
for n N.
In view of (20) it follows immediately that
lim
n
F
n
F
n1
= [1, 1, 1, , . . .], (23)
but what is the value of this limit? If we denote this limit by x, then x = 1 +
1
x
,
which shows that x has to be a root of the quadratic polynomial X
2
X1 = (X
G)(X +g) which we already know from the proof of Hurwitz theorem. Thus, G =
[1, 1, 1, . . .] and g =
1
G
= [0, 1, 1, 1, . . .] . This observation gives some information on
te Fibonacci sequence. For instance, we obtain by Theorem 10
F
n
F
n2
F
2
n1
= (1)
n
for n N.
Another beautiful identity was found by Lucas in 1876 who showed
(F
m
, F
n
) = F
(m,n)
for m, n N.
The mathematical journal Fibonacci Quarterly is exclusively devoted to the Fi-
bonacci sequence.
Exercise 14 Prove Bernoullis formula
F
n
=
G
n
(g)
n

5
=
_
G
n

5
+
1
2
_
for n N,
and deduce for the generating series

m=0
F
m
z
m
=
z
1 z z
2
for [z[ < g.
As a second example we shall compute the continued fraction expansion for

2.
We nd

2 = 1 +
_
1

2 1
_
1
= 1 + (

2 + 1)
1
= 1 +
1
2 + (

2 1)
= [1, 2, 2, 2, . . .].
The paper size DINA4 has a very useful self-similarity property. It has approximately
the same proportion when folded in half. It is easy to show that if it would have the
same proportion, then this proportion would equal

2. By denition, A4 paper
has length 29.7 and width 21 centimeters. The proportion is a convergent to

2:
29.7
21
=
99
70
= [1, 2, 2, 2, 2, 2]

2 = [1, 2, 2, 2, . . .].
Both examples of quadratic irrationals from above have some pattern in common:
their continued fraction expansion is eventually periodic! We say that the continued
fraction [a
0
, a
1
, . . .] is periodic if there exists an integer k with a
n+k
= a
k
for all
suciently large n. We write
[a
0
, a
1
, . . . , a
r
, a
r+1
, . . . , a
r+k
] = [a
0
, a
1
, . . . , a
r
, a
r+1
, . . . , a
r+k
, a
r+1
, . . . , a
r+k
, . . .].
25
Exercise 15 i) Show for n N

n
2
+ 2 = [n, n, 2n].
ii) Find for each n N the continued fraction expansion for the positive root of the
polynomial
(6n
2
+ 1)X
2
+ 3n(8n
2
+ 1)X (12n
2
+ 1).
Hint: start with the case n = 1, 2 and search for a pattern!
The pattern of periodicity is overwhelming.
Theorem 14 A number RQ is quadratic irrational if and only if its continued
fraction expansion is eventually periodic.
This is Lagranges theorem; its characterization of quadratic irrationals should be
compared with the irrationality criterion for real numbers which says that a real
number is rational if and only if its decimal fraction expansion is eventually periodic.
Here is the very nice
Proof. First, assume that = [a
0
, a
1
, . . . , a
k1
] . Then
=
p
k1
+ p
k2
q
k1
+ q
k2
,
resp.
q
k1

2
+ (q
k2
p
k1
) p
k2
= 0.
Since is irrational the polynomial q
k1
X
2
+(q
k2
p
k1
)X p
k2
is irreducible,
and thus its root is quadratic irrational. Now suppose that
= [a
0
, a
1
, . . . , a
r
, a
r+1
, . . . , a
r+k
]
= [a
0
, a
1
, . . . , a
r
, ] with = [a
r+1
, . . . , a
r+k
].
We have already proved that is quadratic irrational. Hence
=
p
r
+ p
r1
q
r
+ q
r1
is quadratic irrational as well (since Q() is a eld).
Conversely, assume that = [a
0
, a
1
, . . . , a
n1
,
n
] is quadratic irrational. Then
there exist a, b, c Z such that a
2
+ b + c = 0. Substituting
=

n
p
n1
+ p
n2

n
q
n1
+ q
n2
we obtain
A
n

2
n
+ B
n

n
+ C
n
= 0, (24)
26
where
A
n
= ap
2
n1
+ bp
n1
q
n1
+ cq
2
n1
,
B
n
= 2ap
n1
p
n2
+ b(p
n1
q
n2
+ p
n2
q
n1
) + 2cq
n1
q
n2
,
C
n
= ap
2
n2
+ bp
n2
q
n2
+ cq
2
n2
.
Note that A
n
= 0 would imply that the polynomial in (24) has the root
pn
qn
, which
contradicts , Q. Thus, A
n
X
2
+ B
n
X + C
n
is a quadratic polynomial with root

n
. A short calculation with regard to Theorem 10 shows that its discriminant
coincides with the discriminant of (24):
B
2
n
4A
n
C
n
= (b
2
4ac)(p
n1
q
n2
p
n2
q
n1
) = b
2
4ac. (25)
In view of Theorem 11
p
n1
= q
n1
+

n1
q
n1
with [
n1
[ < 1.
Therefore,
A
n
= a
_
q
n1
+

n1
q
n1
_
2
+ bq
n1
_
q
n1
+

n1
q
n1
_
+ cq
2
n1
= (a
2
+ b + c)q
2
n1
+ 2a
n1
+ a

2
n1
q
2
n1
+ b
n1
= 2a
n1
+ a

2
n1
q
2
n1
+ b
n1
.
It follows that [A
n
[ < 2[a[ + [a[ + [b[ . Since C
n
= A
n1
the same estimate holds
for C
n
as well. Finally, by (25)
B
2
n
4[A
n
C
n
[ +[b
2
4ac[ < 4(2[a[ +[a[ +[b[)
2
+[b
2
4ac[.
Since the upper bounds for A
n
, B
n
and C
n
do not depend on n, there are only
nitely many dierent triples A
n
, B
n
, C
n
. Thus we can nd a triple A, B, C which
occurs at least three times, say as A
n
1
, B
n
1
, C
n
1
and A
n
2
, B
n
2
, C
n
2
and A
n
3
, B
n
3
, C
n
3
.
Consequently,
n
1
,
n
2
and
n
3
are all roots of the quadratic polynomial AX
2
+
BX + C, and at least two of them must be equal. If, for example,
n
1
=
n
2
then
a
n
1
= a
n
2
, a
n
1
+1
= a
n
2
+1
, . . . . This proves the theorem.
Exercise 16 Give an estimate for the length of the period of the continued fraction
expansion of a quadratic irrational.
In particular, the partial denominators of quadratic irrationals are bounded.
Nearly nothing is known about the continued fraction expansions of cubic irrationals
or even transcendental numbers in general. For such numbers it is conjectured that
their sequence of partial denominators is unbounded. This topic is investigated in
27
the metric theory of continued fraction. For instance, one can show that the set of
real numbers with bounded partial denominators has measure null. On the other side
one can prove that if (n) is any positive function, then the inequality
a
n
= a
n
() (n)
holds for almost all real if and only if the series

n=1
(n)
diverges. One of the highlights is the limit theorem of Gauss-Kusmin: if meas
n
(x)
denotes the measure of the set of (0, 1) for which [a
n
, a
n+1
, . . .] a
n
< x, then
lim
n
meas
n
(x) =
log(1 + x)
log 2
for any x (0, 1).
It should be noted that Gauss stated this result in a letter to Laplace but never
published a proof; the rst published proof was given by Kusmin in 1928. For
details to this deep result we refer the interested reader to [27]. It is rather dicult
to say anything on sums or products of continued fractions. In 1947 Hall [20] showed
that any real number can be represented as sum of two irrationals whose continued
fractions only contain partial denominators 3; this beautiful result is related to
the geometry of Cantor sets. Another interesting problem is the representation of
continued fractions as sums of unit fractions; this is related to the Erdos-Strauss
conjecture which claims that every positive integer n > 1 there exist a, b, c N
such that
4
n
=
1
a
+
1
b
+
1
c
.
In [14] it is shown, among others, that for any given positive integer m there exists
a continued fraction of length m being the sum of two unit fractions.
7 The continued fraction for e
In 1748 Euler found the continued fraction of e but he had no full proof. We shall
ll the gaps.
Theorem 15 For k N,
exp
_
2
k
_
+ 1
exp
_
2
k
_
1
= [k, 3k, 5k, 7k, . . .].
Proof. For any non-negative integer n dene

n
=
1
n!
_
1
0
x
n
(1 x)
n
exp
_
2x
k
_
dx,

n
=
1
n!
_
1
0
x
n+1
(1 x)
n
exp
_
2x
k
_
dx;
28
the integrands should be compared with the functions f
n
which we used in the proof
of the irrationality of e. It is not dicult to compute

0
=
k
2
_
exp
_
2
k
_
1
_
,
0
=
k
2
exp
_
2
k
_

_
k
2
_
2 _
exp
_
2
k
_
1
_
. (26)
Exercise 17 Show for n N that
2
k

n
+
n1
= 2
n1
and k(2n + 1)
n
= k
n1
2
n
.
With regard to these formulae we may eliminate the
n
s:
2
k

n+1
+ k(2n + 1)
n
=
k
2
_
2
k

n
+
n1
_

n
=
k
2

n1
,
resp.
2
n+1
k
n
+ (2n + 1)k =
k
n1
2
n
. (27)
In view of (26) we deduce

1
=
k
2
(2
0

0
) =
_
k
2
_
2 _
exp
_
2
k
_
+ 1 k
_
exp
_
2
k
_
1
__
.
This leads to
exp
_
2
k
_
+ 1
exp
_
2
k
_
1
=
_
2
k
_
2

1
+ k
_
exp
_
2
k
_
1
_
2
k

0
= k +
2
1
k
0
.
It is easily seen that 2
n+1
< k
n
for n N, which yields in view of (27)
exp
_
2
k
_
+ 1
exp
_
2
k
_
1
= k +
_
k
0
2
1
_
1
=
_
k, 3k,
k
1
2
2
_
= . . . .
This proves the theorem.
In particular, with regard to Lagranges theorem we nd a slightly improvement of
Theorem 1:
Exercise 18 Prove that e is neither rational nor quadratic irrational.
In what follows we exactly follow Euler in computing the continued fraction
expansion for e. Taking k = 2 in Theorem 15 we get
e + 1
e 1
= [2, 6, 10, 14, . . .];
denote its n-th convergents by
pn
qn
. Further, dene the real number E by E =
[A
0
, A
1
, A
2
, . . .] , where
A
0
= 2, A
3n2
= A
3n
= 1 and A
3n+1
= 2n,
and let
Pn
Qn
be the n-th convergent to E.
29
Exercise 19 Prove for n N the identities
P
3n+1
= p
n
+ q
n
and Q
3n+1
= p
n
q
n
.
Hint: for 3 j 1 write down P
3n+j
in terms of P
3n+j1
and P
3n+j2
, and multiply the
resulting ve equations by 1, 1, 2, 1 and 1. Then the identity for P
3n+1
follows by induction; the
formula for Q can be proved analogously.
This implies
E = lim
n
P
3n+1
Q
3n+1
= lim
n
pn
qn
+ 1
pn
qn
1
=
e+1
e1
+ 1
e+1
e1
1
= e.
Thus, we have proved
Theorem 16 e = [2, 1, 2, 1, 1, 4, 1, . . .] .
Eulers approach to continued fractions was rather dierent. He expanded a certain
generalization of the exponential function into a continued fraction (see [29]). This
idea can be regarded as the starting point of the so-called Pade approximations,
which is the theory of approximating transcendental function by rational function.
It is an easy consequence that e is not approximable by an order > 2:
Exercise 20 Show that

e
p
q

>
c
q
2
log q
for all
p
q
Q with q > 1 and some positive constant c.
It should be noted that the continued fraction for shows no pattern so far.
Curiously, if we switch to another expansion we may nd patterns.
God loves the odd integers. (Leibniz)
From the Leibniz series

4
= 1
1
3
+
1
5

1
7
. . . ,
Euler deduced the continued fraction expansion

4
=
1
1 +
1
2 +
9
2 +
...
+
(2n + 1)
2
2 +
...
.
30
8 Markos spectrum
The constant

5 in Hurwitz theorem depends mainly on best approximations of


the golden ratio = G. The same argument as in its proof suggests that in case of
=

2 the constant in Hurwitz theorem can be rened to the larger constant

8
(see Exercise 10).
For this purpose we have to introduce some new denitions. We call two real
numbers and equivalent if there are integers a, b, c, d such that
=
a + b
c + d
with ad bc = 1. (28)
Such a mapping z
az+b
cz+d
is called unimodular transformation (that is a special
Mobius transformation). They build the group of unimodular transforma-
tions (for details on the beautiful theory of modular transformations and their
geometry see [30]). Therefore, equivalence of real numbers is symmetric, transitive
and reexive. In view of Theorem 10 each real number is equivalent to any of
its tails
n
, more precisely: = [a
0
, a
1
, . . . , a
n1
,
n
] may be regarded as n suc-
cessive modular transformations. It is easily seen that any two rational numbers
are equivalent (it suces to show that each rational is equivalent to zero). The
interpretation of the real line via the actions of modular transformations provides a
new view on our previous observations on the Farey sequence and on the continued
fraction algorithm.
Exercise 21 Prove that a real number = [a
0
, a
1
, . . . , a
n1
, ] , where n 2, has
a representation by a modular transformation (28) if and only if c > d > 0.
Hint: use induction on d for the converse implication.
Now we are in the position to compare the continued fraction expansions of
equivalent numbers.
Theorem 17 Two irrational numbers and are equivalent if and only if their
continued fraction expansions are eventually identical; more precisely: there exist
positive integers m, n and a real number > 1 such that
= [a
0
, . . . , a
n
, ] and = [b
0
, . . . , b
m
, ].
Proof. By Theorem 10,
=
p
n
+ p
n1
q
n
+ q
n1
,
where p
n
q
n1
p
n1
q
n
= 1. Hence, and are equivalent. Similarly, and
are equivalent. Since the M obius transformations form a group, and are also
equivalent. Conversely, assume that and are equivalent. Hence, the identity
(28) holds with c > d > 0 (by the previous exercise). This together with
= [b
0
, . . . , b
m
, ] =
p
m
+ p
m1
q
m
+ q
m1
31
implies
=
P + R
Q + S
,
where
P = ap
m
+ bq
m
, R = ap
m1
+ bq
m1
, Q = cp
m
+ dq
m
, S = ap
m1
+ bq
m1
,
and PS QR = 1 (this is easily seen by considering modular transforms as
matrices). In view of Theorem 11
p
mj
= q
mj
+

j
q
mj
with [
j
[ < 1
for j = 0, 1. Consequently,
Q = (c + d)q
m
+
c
0
q
m
> (c + d)q
m1
+
c
1
q
m1
= S
for suciently large m. Obviously, we may assume without loss of generality that
c + d > 0, which implies S > 0. With view to Exercise 21 it follows that =
[a
0
, . . . , a
n
, ] , which was to show.
Now let |x| = min[x z[ : z Z. The Marko constant for is dened
by
M() = liminf
q
q|q|.
In view of Hurwitz theorem we have M() M(G) =
1

5
for any real (in view
of (23), which supports our observation M(g) 0.477 . . . from Exercise 9). We
need a more explicit representation of the Marko constant. Taking into account
(21) we have

p
n
q
n
=
1

n
q
2
n
, where
n
= (1)
n
_

n+1
+
q
n1
q
n
_
and
n+1
= [a
n+1
, a
n+2
, . . .] . Since
q
n1
q
n
=
1
q
n
/q
n1
=
1
a
n
+
q
n2
q
n1
= [0, a
n
, a
n1
, . . . , a
1
],
we deduce
M() = liminf
n
1

n
= liminf
n
1
[a
n+1
, a
n+2
, . . . , ] + [0, a
n
, a
n1
, . . . , a
1
, a
0
]
.
In view of Theorem 17 we obtain
Theorem 18 If and are equivalent, then M() = M().
32
In particular, if is irrational with positive M(), then there exist innitely
many rationals
p
q
for which


p
q

M()
q
2
.
This yields the following renement of Hurwitz theorem. Let R Q. Then
M() =
1

5
if and only if is equivalent to G. If is not equivalent to G, then
M() M(

2) =
1

8
; this can be proved along the lines of the proof of Theorem
9. This shows that all real numbers, which are equivalent to the golden ratio,
that are those which have an eventually periodic continued fraction with period 1,
are the worst approximable irrationals (this should be compared with Exercise 2).
This is only the rst step as Marko observed. We could continue this process to
nd successively smaller Marko constants by further restrictions on the continued
fraction expansions. The set of all Marko constants is the Marko spectrum.
This leads to
G = [1] with M(G) =
1

5
= 0.44721 . . . ,
1 +

2 = [2] with M(1 +

2) =
1

8
= 0.35355 . . . ,

9+

221
10
= [2, 2, 1, 1] with M
_
9+

221
10
_
=
5

221
= 0.33633 . . . ;
for a longer list see [9]. There is a deep theorem of Marko [32] which states that
the Marko spectrum above
1
3
consists exactly of numbers of the form
z

9z
2
4
,
where z is a positive integer such that there exist x, y N with maxx, y z
satisfying the diophantine equation
X
2
+ Y
2
+ Z
2
= 3XY Z.
Unfortunately, this is far beyond our scope. Further, it can be shown that any
positive real number less than Freimans number
153640040533216 19623586058

462
693746111282512
= 0.22085 . . .
is in the Marko spectrum. See [49] for a nice survey on the Marko spectrum and
its interaction with hyperbolic geometry.
Exercise 22 Compute the Marko constant for =

n
2
+ 2 for n N.
Hint: see Exercise 15.
33
We call a number badly approximable if M() > 0, i.e. there exists a
positive constant c, depending only on , such that for all
p
q


p
q

>
c
q
2
holds. The following theorem classies badly approximable numbers.
Theorem 19 An irrational is badly approximable if and only if its partial quo-
tients are bounded.
For instance, quadratic irrationals are badly approximable, but not e (see Theorem
16 and Exercise 20).
Proof. In view of (21)
1
(a
n+1
+ 2)q
2
n


p
n
q
n

1
a
n+1
q
2
n
. (29)
By Theorem 12, the law of best approximations, other rationals cannot approximate
better than convergents do. This proves the theorem.
9 The Pell equation
In a letter to Eratosthenes Archimedes (287 212 B.C.) posed the so-called cattle
problemin which he asks for the number of bulls and cows belonging to the Sun god,
subject to certain arithmetical restrictions. This problem was already forgotten over
the centuries until it was rediscovered by G.E. Lessing in the Wolfenb uttel library
in 1773. A nice English version goes as follows:
The Sun gods cattle, apply thy care to count their numbers, hast thou wisdoms
share. They grazed of old on the Thrinacian oor of Siclys island, herded in to
four, colur by colour: one herd white as cream, the next in coats glowing with ebon
gleam, brown-skinned the third, and stained with spots the last. Each herds saw
bulls in power unsurpassed, in ratios these: count half the ebon-hued, add one third
more, then all the brown include; thus, friend, canst thou the white bulls number
tell. The ebon did the brown exceed as well, now by a fourth and fth part of the
stained. To know the spotted - all bulls that remained - reckon again the brown
bulls, and unite these with a sixth and seventh of the white. Among the cows, the
tale of silver-haired was, when with bulls and cows of black compared, exactly one
in three plus one in four. The black cows counted one in four once more, plus now
a fth, of the bespeckled breed when, bulls withal, they wandered out to feed. The
speckled cows tallied a fth and sixth of all the brown-haired, males and females
mixed. Lastly, the brown cows numbered half a third and one in seven of the silver
herd. Tellst thou unfailingly how many head the Sun possessed, o friend, both bulls
34
well-fed and cows of evry colour - no-one will deny thou hast numbers art and
skill, though not yet dost thou rank among the wise. But come! also the follwing
recognise. Wheneer the Sun Gods white bulls joined the black, their multitude
would gather in a pack of equal length and breadth, and squarely throng Thrinacias
territory broad and long. But when the brown bulls mingled with the ecked, in
rows growing from one would they collect, forming a perfect triangle, with neer a
dirent-coloured bull, and none to spare. friend, canst thou analyse this in thy
mind, and of these masses all the measures nd, go forth in glory! be assured all
deem thy wisdom in this discipline supreme! (cf. [1])
Denoting by w, x, y and z the numbers of the white, black, dappled, and brown
bulls, and by , A, ] and 2 the numbers of the white, black, dappled, and brown
cows, respectively, one has to solve the system of linear diophantine equations
w =
_
1
2
+
1
3
_
x + z , x =
_
1
4
+
1
5
_
y + z , y =
_
1
6
+
1
7
_
w + z, (30)
and
=
_
1
3
+
1
4
_
(x +A) , A =
_
1
4
+
1
5
_
(y +]) , (31)
] =
_
1
5
+
1
6
_
(z +2) , 2 =
_
1
6
+
1
7
_
(w +).
Due to Archimedes everyone who can solve this problem is merely competent; but
to win the prize for supreme wisdom one has to meet the additional condition that
w + x has to be a square, and that y + z has to be a triangular number, i.e., a
number of the form 1+2+. . . +n =
1
2
n(n+1) (imagine the numbered balls in pool
billiard). It is easily seen that the general solution to (30) is given by
(w, x, y, z) = m (2226, 1602, 1580, 891), where m N.
It turns out that the system (31) is solvable if and only if m is a multiple of 4657.
Setting m = 4657 M we obtain for the general solution of (31)
(, A, ], 2) = M (7206360, 4893246, 3515820, 5439213), where / N.
This solves the rst part of the cattle problem. To solve also the second part one has
to nd an / such that w+x = 46573828/ is a square and y+z = 46572471/
is triangular. By the prime factorization 4657 3828 = 2
2
3 11 29 4657 it follows
that w + x is a square if and only if / = 3 11 29 4657 Y
2
, where Y N.
Since y +z is triangular if and only if 8(y +z) +1 is a square, one has to solve the
quadratic equation
X
2
410 286 423 278 424 Y
2
= 1,
where 410 286 423 278 424 = 2 3 7 11 29 353 (2 4657)
2
, which does not look too
easy. It seems that the ancient Greek were unable to solve this equation; see [31] or
[47] for nicely written surveys on the cattle problem, its history and the rst prize
winners for supreme wisdom.
35
We shall even solve a more general problem. The Pell equation is dened by
X
2
dY
2
= 1, where d N. (32)
It should be noted that Pell was an English mathematician who lived in the sev-
enteenth century but he had nothing to do with this equation (it was Euler who
mistakenly attributed a solution method to Pell which in fact was found by Pells
contemporaries Wallis and Lord Brouncker). We are interested in integral solutions.
In some sense, we have to nd the set of intersections of a hyperbola with the lat-
tice Z
2
. Obviously, x = 1 and y = 0 is always a solution, but are there more?
By symmetry it suces to look for solutions in positive integers. If d is a perfect
square, we can factor the left hand side and it turns out that (32) has only nitely
many integral solutions. Thus, the case of a perfect square d is boring and we may
assume in the sequel that d is not a perfect square, resp.

d , Q.
Our rst deeper observation is due to Euler. Assume that x, y is a solution of
(32). Then we may factor the left hand side of (32) which leads to
(x y

d)(x + y

d) = x
2
dy
2
= 1,
resp.

d
x
y

=
1
y
2
(

d +
x
y
)
<
1
2y
2
.
In view of Exercise 13 all solutions of (32) can be found among the convergents to

d. For instance, the sequence of convergents


pn
qn
to

2 starts with
1
1
,
3
2
,
7
5
,
17
12
,
41
29
,
99
70
, . . .

2 = [1, 2],
and in fact we get for p
2
n
2q
2
n
the values
1
2
2 1
2
= 1, 3
2
2 2
2
= +1, 7
2
2 5
2
= 1,
17
2
2 12
2
= +1, 41
2
2 29
2
= 1, 99
2
2 70
2
= +1.
The regularity is astonishing! It suggests to consider instead of (32) the more general
quadratic equation
X
2
dY
2
= 1; (33)
according to the sign we speak about the plus and the minus equation, respectively.
Exercise 23 Consider the the cases d = 11, 13. Write down the convergents
pn
qn
to

d for the rst two periods and compute the values p


2
n
dq
2
n
. Give a conjecture
how the set of solutions of (33) looks like.
36
It is easily seen that if d is a multiple of a prime p 3 mod 4, then the minus
equation is unsolvable (since squares are congruent 0, 1 mod 4).
From a mathematical point of view the Pell equation is important with respect
to its role in algebraic number theory. A complex number is said to be alge-
braic over Q if there exists a polynomial P(X) with rational coecients such
that P() = 0; the polynomial with leading coecient 1 and minimal degree hav-
ing this property is called minimal polynomial of and we denote it by P

(X)
(this generalizes our concept of quadratic irrationals); the degree of the minimal
polynomial is said to be the degree of ; for short d := deg = deg P

. The set
Q() = a
0
+ a
1
+ . . . + a
d1

d1
: a
j
Q
is a nite algebraic extension of the eld of rational numbers, the algebraic number
eld associated to ; it has degree d = [Q() : Q()] (which may be regarded
as the dimension of the rational vector space Q()). Note that any number in
Q() is algebraic. The zeros

1
, . . . ,

d
of the minimal polynomial P

(X) are the


conjugates of (i.e. the images of under the eld automorphisms). The product
of all conjugates is the norm of :
N() :=
d

j=1

j
;
note that this is, up to a sign, equal to the constant term P

(0) in the minimal


polynomial; the norm provides a measure for the size of algebraic numbers. An
algebraic number is said to be an algebraic integer if its minimal polynomial has
integral coecients. The set of all algebraic integers in a number eld form a ring,
the so-called ring of integers. Unfortunately, and somehow surprisingly, these
rings in general do not have a unique prime factorization any longer. For example,
2 3 = (1

5) (1 +

5)
gives two distinct factorizations of 6 in the ring Z[

5] into irreducible factors


(one can overcome this problem by introducing prime ideals but this is another
story). For an understanding of the structure of number elds it is important to
study its integers and, in particular, its units. An algebraic integer is called unit if
the absolute value of its norm is equal to one.
In case of deg = 2 we call Q() a quadratic number eld. It is easily seen that
there always exists d Z, which is not a perfect square, such that Q() = Q(

d).
We say that Q(

d) is a real or imaginary quadratic number eld according to

d R or not. One can show that the ring of integers equals Z[

d] if d
2, 3 mod 4, and Z[
1+

d
2
] otherwise. For example, the golden ratio G =
1+

5
2
is an
algebraic integer in Q(

5). The conjugates of algebraic integers are of the form


x y

d, where x, y are rational with denominators 2. Therefore, the units of


a real quadratic number eld are given by the related rational solutions of the Pell
equation
x
2
dy
2
= (x + y

d)(x + y

d)

= N(x + y

d) = 1,
37
resp. to the integral solutions of the slightly more general equation X
2
dY
2
=
1, 4 according to the residue class of d modulo 4. An imaginary quadratic
number eld has only nitely many units, each of them being a root of unity. For
more details see [10] or [21].
Before we return to the Pell equation we need to introduce the notion of re-
ducibility. A quadratic irrational is called reduced if > 1 and 1 <

< 0.
The following important result is due to Galois.
Theorem 20 The continued fraction expansion of a quadratic irrational number
is purely periodic if and only if is reduced.
We prove only the implication which we shall use later on.
Proof. Assume that = [a
0
, a
1
, . . . , a
n1
,
n
] is reduced. Then, for n = 0, 1, . . . ,

n
= a
n
+
1

n+1
and

n
= a
n
+
1

n+1
.
Consequently,
n
> 1. If

n
< 0 it follows that
1 <

n+1
=
1

n
a
n
< 0.
Hence, by induction, all
n
are reduced. In particular,
0 <
1

n+1
a
n
< 1 , resp. a
n
=
_

n+1
_
.
Since is quadratic irrational, by Lagranges theorem its continued fraction is
eventually periodic. Thus there exist k < l for which
k
=
l
. It follows that
a
k
= a
l
and that

k
=

l
. Thus,
a
k1
=
_

k
_
=
_

l
_
= a
l1
.
We conclude by induction that the continued fraction of is purely periodic.
Exercise 24 Assume that = [a
0
, . . . , a
n
] is a quadratic irrational. Show that
[a
n
, . . . , a
0
] =
1

.
Deduce the other implication in Galois theorem.
Hint: It suces to show that if [a, b
1
, . . . , b
n
] is reduced, then a = b
n
.
Now we can give a detailed description of the continued fraction expansion of

d due to Legendre and Lagrange.


38
Theorem 21 If d is not a perfect square, then

d = [[

d], a
1
, . . . , a
N1
, 2[

d]].
Proof. Obviously, 1 < [

d]

d < 0. Consequently,

d + [

d] > 1 is reduced,
and hence by Galois theorem purely periodic:

d + [

d] = [2[

d], a
1
, . . . , a
N1
] = [2[

d], a
1
, . . . , a
N1
, 2[

d]].
This implies the representation given in the theorem.
In what follows N = N(d) denotes the minimal length of the periods of the
continued fraction expansion of

d. We write

d = [[

d], a
1
, . . . , a
n
,
n+1
],
then

1
=
1

d [

d]
=

d + [

d]
d [

d]
2
=:
P
1
+

d
Q
1
,
where P
1
, Q
1
are integral and Q
1
divides d P
2
1
. Now assume that

n
=
P
n
+

d
Q
n
, where P
n
, Q
n
Z, Q
n
[ (d P
2
n
). (34)
Then it follows that

n+1
=
1

n
a
n
=
Q
n
P
n
a
n
Q
n
+

d
=
Q
n
(P
n
a
n
Q
n

d)
(P
n
a
n
Q
n
)
2
d
=:
P
n+1
+

d
Q
n+1
,
where P
n+1
= a
n
Q
n
P
n
and
Q
n+1
=
d (P
n
a
n
Q
n
)
2
Q
n
=
d P
2
n
Q
n
. .
Z
+2a
n
P
n
a
2
n
Q
n
are integral. Since
Q
n
=
d (P
n
a
n
Q
n
)
2
Q
n+1
=
d P
2
n+1
Q
n+1
,
it follows that Q
n+1
divides d P
2
n+1
. Hence, by induction on n we see that each

n
has a representation of the form (34). Consequently,

d =

n
p
n1
+ p
n2

n
q
n1
+ q
n2
=
(P
n
+

d)p
n1
+ p
n2
Q
n
(P
n
+

d)q
n1
+ q
n2
Q
n
,
resp.

d((P
n
+

dQ
n
)q
n1
+ q
n2
Q
n
) = (P
n
+

dQ
n
)p
n1
+ p
n2
Q
n
.
39
Since

d , Q, splitting the latter identity into its rational and its irrational parts
yields
dq
n1
= P
n
p
n1
+ p
n2
Q
n
and p
n1
= P
n
q
n1
+ q
n2
Q
n
.
Multiplying the rst one by q
n1
and the second one by p
n1
, subtraction of both
equations gives with regard to Theorem 10
p
2
n1
dq
2
n1
= Q
n
(p
n1
q
n2
p
n2
q
n1
) = (1)
n
Q
n
. (35)
This should be compared with Exercise 23. In particular, if n is multiple of the
minimal period N, then
P
kN
+

d
Q
kN
=
kN
= [0, a
1
, . . . , a
N1
] =

d [

d],
by (34). Consequently, Q
kN
= 1. Thus, the plus-equation has innitely many
solutions in positive integers x
k
, y
k
given by
(x
k
, y
k
) =
_
(p
kN1
, q
kN1
) if N 0 mod 2,
(p
2kN1
, q
2kN1
) if N 1 mod 2.
(36)
The minus-equation has innitely many solutions x
k
= p
(2k1)N1
, y
k
= q
(2k1)N1
if N is odd; actually, it can be shown that the minus equation is solvable if and only
if the minimal period N is odd; for another criterion see also Exercise 39 below.
However, we focus on the plus-equation and ask whether there are more solutions
than those given above? With regard to (35) we have to ask for solutions of Q
n
= 1.
It can be shown that Q
n
is equal to one if and only if n is a multiple of the minimal
period N. The main idea behind is that the numbers
0
,
1
, . . . ,
N1
are pairwise
distinct (since N is the minimal period length), which implies that Q
n
= 1 holds
only whenever n is a multiple of N. However, for the rather technical proof we
refer the interested reader to [38]. The solution A, ] N of (32) with minimal
A is called minimal solution. By the remark from above, one can show that
(A, ]) = (x
1
, y
1
), where the right hand side is the solution of (32) according to
(36). Anyway, it is clear that A p
2N1
. Hence, solving the Pell equation is
reduced to a nite problem. However, the length of periods of surds

d can be
rather long; see [31] for fast algorithms.
It is a well-known fact from algebra that the units of a ring form a multiplicative
group. Therefore, it is not surprising, that the the solutions of the Pell equation
dene a group. In fact, we shall show that this group is cyclic, generated by the
minimal solution. First, we return to our example with d = 2. We observe that
(3 +2

2)
2
= 17 +12

2 and 17
2
2 12
2
= (3
2
2 2
2
)
2
= 1.
This leads to
Theorem 22 The Pell equation has innitely many solutions x, y Z, all of them
are up to the sign given by powers of the minimal solution A, ] N:
x + y

d := (A +]

d)
n
, where n = 0, 1, 2, . . . .
40
Proof. In view of
1
x + y

d
=
(x y

d)
x
2
dy
2
= x y

d (37)
it suces to show that all solutions of (32) in positive integers are given by x+y

d =

n
, where := A +]

d and n N. For any positive solution we can nd an n N


such that

n
x + y

d <
n+1
.
Dene
X + Y

d =
n
(x + y

d),
then we have to prove that the latter expression is equal to one. Since

d is
irrational, conjugation (37) leads to
X Y

d =
n
(x y

d).
Multiplying the latter equation with the previous one, we deduce
X
2
dY
2
=
n+n
(x y

d)(x + y

d)
. .
=x
2
dy
2
=1
= 1.
Suppose now that 1 < X + Y

d < , then, again with (37),


0 <
1
< (X + Y

d)
1
= X Y

d < 1.
It follows that
2X = (X + Y

d) + (X Y

d) > 1 +
1
> 0,
2Y

d = (X + Y

d) (X Y

d) > 1 1 = 0.
Thus, X and Y are positive integral solutions of the Pell equation with 1 < X +
Y

d < . Since x+y

d increases with y, we get Y < ] and X < A , contradicting


the fact that A, ] is the fundamental solution. It follows that X +Y

d = 1. The
assertion of the theorem follows.
In the following table some continued fraction expansions of

d and the related


minimal solutions (A, ]) are listed:

2 = [1, 2] : (3, 2),

3 = [1, 1, 2] : (2, 1),

5 = [2, 4] : (9, 4),

19 = [4, 2, 1, 3, 1, 2, 8] : (170, 39),

99 = [9, 1, 18] : (10, 1),

109 = [10, 2, 3, 1, 2, 4, 1, 6, 6, 1, 4, 2, 1, 3, 2, 20] : (158070671986249, 15140424455100),

2002 = [44, 1, 2, 1, 9, 5, 6, 9, 1, 2, 1, 88] : (11325887, 253128).


41
It can be shown that the minimal solution of the Pell equation related to
the cattle problem leads to an herd consisting of N = 77602 . . . 81800 many
bulls and cows, where the number N has 206545 digits. On the webpage
www.bioinfo.rpi.edu/ zukerm/cgi bin/dq.html the interested (lazy?) reader can nd
an online Pell solver. The size of the minimal solution of the Pell equation behaves
quite unregularly by increasing d. This is an interesting topic related to several
deep and important questions; for example, the class number problem.
Another related topic are polynomial Pell equations. For instance, let d = 1
or d = 2, then in [36] it is proved that the sequences of polynomials given by
P
0
= 1, Q
0
= 0 and, for n N,
P
n
(X) =
_
2
d
X
2
+ 1
_
P
n1
(X) +
2
d
X(X
2
+ d)Q
n1
(X),
Q
n
(X) =
2
d
XP
n1
(X) +
_
2
d
X
2
+ 1
_
Q
n1
(X)
satisfy the polynomial equation P(X)
2
(X
2
+d)Q(X)
2
= 1, and that these are (up
to the sign) all solution of the latter equation. However, it is an open problem to
determine the polynomials D for which the polynomial Pell equation P
2
DQ
2
= 1
has non-trivial solutions.
Since it has nothing to do with Pell or the Pell equation, we conclude with
another interesting innite representation due to Ramanujan:
Exercise 25 Prove that
n + 1 =
_
1 + n
_
1 + (n + 1)
_
1 + (n + 2)

. . . for n N.
10 Factorization with continued fractions
We are living in times with exponentially increasing electronic information exchange.
This led to so-called public key-cryptosystems which mainly rely on so-called one-
way functions, i.e. a simple operation which has an inverse that can hardly be
computed. It is simple to multiply integers but, conversely, it is rather dicult
to nd the prime factorization of a given large integer; it is conjectured that the
factorization of integers is an NP-hard problem, i.e., roughly speaking, there exists
no fast algorithm. In the sequel we follow [10] and [28].
One of the rst modern factorization methods is the Continued fraction
method CFRAC due to Lehmer and Powers. This algorithm was rst implemented
by Brillhart and Morrison in 1970. As a proof for the power of it they computed the
factorization of the 38-digit seventh Fermat number F
7
:= 2
2
7
+1; for the current
knowledge on Fermat numbers see www.prothsearch.net/fermat.htmlPrime. In the
following years CFRAC became the main factoring algorithm in practice; actually
it was the rst algorithm of expected sub-exponential running time. CFRAC relies
in the main part on some ideas due to Fermat, Legendre and Kraitchik. Suppose
42
that we are interested in the prime factorization of a large integer N. If there are
integers x, y for which
x
2
y
2
mod N and x , y mod N,
then the greatest common divisor (N, x + y) is a non-trivial factor of N. This
follows immediately from the identity x
2
y
2
= (x y)(x + y). To look randomly
for pairs of squares which satisfy these conditions seems to be hopeless. The main
idea is to collect suciently many squares which lie mod N in the same residue
class, such that certain combinations among them lead to non-trivial divisors of N.
More precisely, having suciently many congruences
x
2
j
(1)

0j

1j
1

2j
2
. . .

mj
m
mod N,
where the
k
are small prime numbers and the
kj
are the related exponents, by
Gaussian elimination modulo 2 we may hope to nd a relation of the form

jn

j
(
0j
, . . . ,
mj
) (0, . . . , 0) mod 2, (38)
where
j
0, 1. Then, setting
x =

jn
x

j
j
and y = (1)

1
1

2
2
. . .

m
m
, (39)
where

jn

j
(
0j
, . . . ,
mj
) = 2(
0
,
1
, . . . ,
m
), we get x
2
y
2
mod N. This splits
N if x , y mod N. The set of prime numbers which are choosen to nd the
congruences in addition with 1 is called factor basis.
Whereas in the Quadratic sieve factoring algorithm the squares x
2
j
are generated
by the nearest integers to

N the continued fraction algorithm works with the


numerators of the convergents to

N. This idea becomes more transparent in the


algorithm
CFRAC
For j = 0, 1, 2, . . . successively:
1. Compute the j th convergent
p
j
q
j
of the continued fraction expansion to

N.
2. Compute p
2
j
mod N such that the absolute value is
1
2
N. After doing this for
several j , look at the numbers p
2
j
mod N which factor into a product of small primes.
Dene your factor base B to consist of 1, the primes which either occur in more than
one of the p
2
j
mod N or which occur to an even power in just one p
2
j
mod N.
3. List all of the numbers p
2
j
mod N which can be expressed as a product of numbers
in the factor base B. If possible, nd a subset of numbers s of B for which the
exponents according to the numbers in B sum to zero modulo two as in (38), and
dene x, y by (39). If x , y mod N, then (x + y, N) is a non-trivial factor of
N. If this is impossible, then compute more p
2
j
mod N, enlarging the factor base B if
necessary.
43
Once the number of completely factored integers exceeds the size of the factor base,
we can nd a product of them which is a perfect square. With a little luck this
yields a non-trivial factor of our given number (by the observations from above).
The crucial property of the values p
j
is, as we shall show below, that their squares
have small residues modulo N. Otherwise, CFRAC would hinge on the problem of
nding an appropriate factor base B.
Theorem 23 Let > 1 be irrational. Then the convertgents
pn
qn
to satisfy the
inequality
[q
2
n

2
p
2
n
[ < 2.
In particular, if =

N, where N N is not a perfect square, then the residue


p
2
n
mod N which is smallest in absolute value is less than 2

N.
Proof. In view of Theorem 11,
[q
2
n

2
p
2
n
[ = q
2
n


p
n
q
n

+
p
n
q
n

<
q
2
n
q
n
q
n+1
_
+
_
+
1
q
n
q
n+1
__
.
Thus,
[q
2
n

2
p
2
n
[ 2 < 2
_
1 +
q
n
q
n+1
+
1
2q
2
n+1
_
< 2
_
1 +
q
n
+ 1
q
n+1
_
,
which is less or equal to zero, and proves the rst assertion of the theorem. The
claim on p
2
n
mod N is an immediate consequence.
Therefore, the sequence of the numerators of the convergents of

N provides
a sequence of p
j
s whose squares have small residues mod N; in fact, for that one
has even not to calculate the actual convergent in the continued fraction expansion,
only the numerator p
j
modulo N is needed.
We illustrate CFRAC by an example. Consider N = 9073. It is easy to compute
the continued fraction expansion and its convergents:
95
1
,
286
3
,
381
4
,
10192
107
,
20765
218
, . . .

9073 = [95, 3, 1, 26, 2, . . .].


This leads to
j [ 0 1 2 3 4
p
j
mod 9073 [ 95 286 381 1119 2619
p
2
j
mod 9073 [ 48 139 7 87 27
It is not too dicult to factor the small numbers p
2
j
mod 9073. We may choose the
factor base B = 1, 2, 3, 7, so that p
2
j
mod N is a B-number for j = 0, 2, 4. The
corresponding vectors of exponents
j
are
1, 4, 1, 0, 1, 0, 0, 1 and 1, 0, 3, 0.
44
The sum of the rst and the third adds up to zero modulo two, which yields
x = 95 2619 3834 mod 9073 and y = 2
2
3
2
= 36.
It follows that
3834
2
36
2
mod 9073.
Since 3834 , 36 mod 9073, we see that (3834 + 36, 9073) = 43 is a divisor of
9073 (the computation of greatest common divisors is easily done by the euclidean
algorithms backwards). This leads to the prime factorization 9073 = 43 211.
Exercise 26 Use CFRAC to factor the integers N = 17873, 25511.
It is a well-known fact that CFRAC does not work for prime powers N = p
k
, k 2
(which is obvious if k is even but also true for odd exponents k). This is not a big
problem. It is easy to check rather quickly whether a given N is a prime power
or not. However there are other examples for which CFRAC does not work. If the
period of the continued fraction expansion of

N is too short, then the algorithm


can produce only a small factor basis, which reduces the chances for factoring N.
For instance, if N = n
2
+2, then it is easily computed that p
2
j
mod (n
2
+2) = 1 or
= 2 (by hand or with our knowledge on the Pell equation with d = N). This gives
in the CFRAC algorithm (x +y, n
2
+2) = n 1 which does not divide n
2
+2 = N.
A suitable renement can be found in [54].
Having this powerful factoring algorithm in mind, we could be pessimistic about
the security of cryptosystems:
Three may keep a secret, if two of them are dead. Benjamin Franklin
However, in modern cryptosystems prime numbers with approximately one hundred
digits are used, whereas numbers which are the product of two such large primes
cannot be factored in an appropriate time by the most modern factorization algo-
rithms in practice. More details on this can be found in the excellent monography
[10]; for applications of continued fraction algorithms to primality tests we refer to
[7].
11 Liouvilles theorem
A complex number is called transcendental if it is not algebraic, i.e., there
exists no polynomial with rational coecients and root . The use of the word
transcendental dates at least back to Leibniz who wrote in 1704 omnem rationem
transcendunt. First of all, it is not clear whether transcendental numbers exist.
Using Cantors notion of countability we can easily nd an answer.
The height H(P) of a polynomial P with integral coecients is dened to be
the maximum of all coecients of P in absolute value. Let D, H be positive real
45
numbers. Then it is easily seen that there exist only nitely many polynomials
P(X) with integral coecients for which
deg P D and H(P) H.
It follows that the set of polynomials with rational coecients is countable. In
particular, the set of roots of such polynomials is also a countable set. On the other
side, R is uncountable. Thus
Theorem 24 The set of algebraic numbers is countable, and the set of transcen-
dental numbers is uncountable.
This shows that almost all numbers are transcendental, but, actually, we do not
know at least one transcendental number so far. This was one of the major problems
from 19th century mathematics. In 1844 Liouville solved the problem.
Theorem 25 For any algebraic number of degree d > 1 there exists a positive
constant c, depending only on , such that


p
q

>
c
q
d
for all rationals
p
q
.
Proof. Denote by P(X) the minimal polynomial of . Then, by the mean-value
theorem,
P
_
p
q
_
= P()
. .
=0
P
_
p
q
_
=
_

p
q
_
P

()
for some lying in between
p
q
and . Obviously, we may assume that


p
q

< 1.
Then [[ < 1 +[[ and hence [P

()[ <
1
c
for some positive c. It follows that


p
q

> c

P
_
p
q
_

.
Since P(X) is irreducible, it turns out that
p
q
is not a zero of P(X), and hence

q
d
P
_
p
q
_

1,
which proves the theorem.
The proof even provides a value for the constant c in terms of the height. The
height H() of an algebraic number is dened by H() = H(P

), where P

is
the minimal polynomial of .
46
Exercise 27 Prove that one can take
c =
1
d
2
(1 +[[)
d1
H()
in Liouvilles theorem.
Liouvilles theorem shows that algebraic numbers cannot be approximated too
good by rationals. Consequently, a real number which can be better approximated has
to be transcendental! This is easily seen as follows: if we assume that = [a
0
, a
1
, . . .]
is algebraic of degree d, then we have in view of equation (29) and Liouvilles theorem
the inequality
c
q
d
n
<


p
n
q
n

1
a
n+1
q
2
n
,
resp.
ca
n+1
< q
d2
n
, (40)
where the positive constant c depends only on . If d = 2, the case of quadratic
irrationals , we see what we already know, namely that the sequence of partial
denominators of is bounded. But whenever
liminf
n
a
n+1
q
n
= ,
we may nd for any positive innitely many convergents to for which a
n+1
q

n
,
contradicting (40). Hence, such an cannot be algebraic. For instance, the real
number = [1, 10
1!
, 10
2!
, 10
3!
, . . .] is transcendental. However, it is more convenient
to deal with series.
Exercise 28 Show that
=

n=1
10
n!
= 0.11000 10000 . . .
is transcendental.
In Exercise 5 we have proved that is irrational.
A real number is a Liouville number if for each positive integers m there
exist numbers a
m
, b
m
such that
0 <


a
m
b
m

<
1
b
m
m
and b
m
> 1.
In view of Liouvilles theorem it is easy to show, by the same reasoning as above,
that every Liouville number is transcendental. It should be noted that Liouville
wrote in his underlying paper
47
Je crois me souvenir quon trouve un theoreme de ce genre dans une lettre de
Goldbach a Euler; mais je ne sache pas que la demonstration en ait jamais ete
donnee. (cf [8])
Liouville numbers are very interesting. One can prove that Liouville numbers form
an uncountable set of Lebesgue measure zero. On the other side, Erdos [15] showed
that every real number can be written as a sum, resp. a product, of two Liouville
numbers; for this and other related results see [41].
12 The transcendence of e and
The Liouville numbers from the previous paragraph look somehow artical. It is
much more dicult to decide whether a given number is transcendental or algebraic.
The rst and path-breaking result in this direction was proved by Hermite in 1873.
Theorem 26 e is transcendental.
Nivens proof for the irrationality of (see Theorem 2) is based on Hermites proof
for the transcendence of e
Proof. Dene
J(t) =
_
t
0
e
tx
f(x) dx
for t 0, where f is a real polynomial of degree m. Integration by parts shows
that
J(t) = e
tx
f(x)

t
x=0
+
_
t
0
e
tx
f

(x) dx = e
t
m

j=0
f
(j)
(0)
m

j=0
f
(j)
(t). (41)
Denote by F the polynomial obtained from f by replacing each coecient with its
absolute value. Then
[J(t)[
_
t
0
e
tx
[f(x)[ dx te
t
F(t). (42)
Suppose now that e is algebraic, then there exists a polynomial P with integral
coecients a
k
and leading coecient a
d
,= 0 for which
d

k=0
a
k
e
k
= 0.
Without loss of generality we may assume that P is the minimal polynomial of e.
For a large prime p set
f(x) = x
p1
d

k=1
(x k)
p
;
48
the degree of f is equal to m = (d + 1)p 1. We shall consider
:=
d

k=0
a
k
J(k).
In view of (41)
=
d

k=0
a
k
_
_
e
k
m

j=0
f
(j)
(0)
m

j=0
f
(j)
(k)
_
_
=
m

j=0
d

k=0
a
k
f
(j)
(k). (43)
For 1 k d we have f
(j)
(k) = 0 for j < p. Further, if j p, then
f
(j)
(k) =
_
j
p
_
p! g
(jp)
(k) , where g(x) :=
f(x)
(x k)
p
.
Thus, for all j , f
(j)
(k) is an integer divisible by p! . Similarly, f
(j)
(0) = 0 for
j < p 1, and further, for j p 1,
f
(j)
(0) =
_
j
p 1
_
(p 1)! h
(jp+1)
(0) , where h(x) :=
f(x)
x
p1
.
It follows that h
(j)
(0) is an integer divisible by p for j > 0, and h(0) = (1)
dp
(d!)
p
.
Consequently, for j ,= p 1, f
(j)
(0) is also an integer divisible by p! , and f
(p1)
(0)
is an integer divisible by (p 1)! but not by p for p > d. Therefore, let p > d. It
follows that J is a non-zero integer (by the rst identity in (43)) which is divisible
by (p 1)! (by the second identity in (43)). Hence, [[ (p 1)! . On the other
hand, the trivial bound F(k) (2d)
m
implies via the estimate (42)
[[
d

k=0
[a
k
[ke
k
F(k) H(e)d(d + 1)(2d)
(d+1)p1
c
p
,
where c is a constant independent of p. Thus,
(p 1)! [[ c
p
,
which is impossible for suciently large p. This contradiction shows that e cannot
be algebraic.
Following Hermites ideas Lindemann succeeded in 1882 in showing
Theorem 27 is transcendental.
Proof. Suppose that is algebraic. It follows that also := i is algebraic, say of
degree d. Denote the conjugates of by
1
= ,
2
, . . . ,
d
. From Eulers identity
exp(i) = 1 it follows that
(1 + e

1
. .
=0
)(1 + e

2
) . . . (1 + e

d
) = 0.
49
The product on the left hand side can be written as a sum of 2
d
terms e

, where
=
1

1
+ . . . +
d

d
,
and
j
= 0 or 1. Now we assume that precisely n of the numbers are non-zero
and denote them by
1
, . . . ,
n
. It follows that
q + e

1
+ . . . + e
n
= 0 , where q := 2
d
n.
Now we shall proceed as in the previous proof and compare estimates for
] :=
n

k=1
J(
k
),
where J(t) is dened by (41) with
f(x) =
np
x
p1
n

k=1
(x
k
)
p
,
is the leading coeecient of the minimal polynomial of and p again denotes a
large prime number, chosen later. We obtain
] = q
m

j=0
f
(j)
(0)
m

j=0
n

k=1
f
(j)
(
k
),
where m = (n + 1)p 1. Now the sum over k is a symmetric polynomial in

1
, . . . ,
n
with integral coecients. From the fundamental theorem on symmetric
functions in addition with the observation that each elementary symmetric function
in
1
, . . . ,
n
is also an elementary symmetric function in the 2
d
numbers it
follows that the sum over k is an integer. Since f
(j)
(
k
) = 0 for j < p the sum in
question is divisible by p! . Further, f
(j)
(0) is an integer divisible by p! if j ,= p1,
and
f
(p1)
(0) = (p 1)!()
np
(
1
. . .
n
)
p
,
which is an integer divisible by (p 1)! but not by p! when p is suciently large.
Thus, [][ (p 1)! if p > q . But on the other side, (42) implies the upper bound
[][
n

k=1
[
k
[F([
k
[) c
p
for some constant c independent on p. As in the previous proof, this gives the
contradiction. is not algebraic.
As an important consequence of Lindemanns theorem one obtains a negative
answer to an old and famous problem of the ancient Greek: it is impossible to square
a circle by using only ruler and compass. This follows from the obvious fact that
all points capable of construction are dened by intersection of lines and circles,
50
and thus have only algebraic distances one from another; but on the other side,

is transcendental. Actually, Lindemann proved a stronger result, namely: for


any distinct algebraic numbers
1
, . . . ,
n
and any non-vanishing algebraic numbers

1
, . . . ,
n
,

1
e

1
+ . . . +
n
e
n
,= 0.
The transcendence of e follows immediately, the one of by Eulers identity.
Exercise 29 Using Lindemanns theorem prove the transcendence of the trigono-
metric functions sin , cos and tan for all non-zero algebraic .
There are many open problems in transcendental number theory. It is yet un-
proved whether e and are algebraically independent or, to be more concrete, are
e + and e both transcendental?
Exercise 30 Show that at least one of the numbers e + , e is transcendental.
In his seventh problem Hilbert asked for a proof that

is transcendental when-
ever ,= 0, 1 and are algebraic irrationals. In 1934 Gelfond and Schneider
(independently) succeeded in showing this deep result. A generalization was ob-
tained by Alan Bakers celebrated estimates for linear forms in logarithms for which
he was awarded with the Fields medal at the International Congress in Moscow
1966. For instance, he proved the transcendence of
exp(
0
)

1
1
. . .
n
n
and that of any non-vanishing linear form

1
log
1
+ . . . +
n
log
n
,
where the
j
and
j
denote non-zero algebraic numbers. For more details and a
proof see [3], [45] or [8]
13 The theorem of Roth
It is natural to ask for stronger versions of Liouvilles theorem. Only a slight im-
provement would imply the niteness of integral solutions of certain important dio-
phantine equations; a theme to which we shall return in a later section. Suppose
that is an algebraic number of degree d, then we might ask for exponents (d)
such that there exists a positive constant c, depending only on , such that for all
rationals
p
q


p
q

>
c
q
(d)
. (44)
The rst renement to Liouvilles bound (d) = d was made by Thue who showed
in 1909 that (d) >
1
2
d + 1. In 1921 Siegel obtained (d) > 2

d (it should be
noted that Siegel became professor in Frankfurt short after this discovery), and the
famous physiscist Dyson proved in 1947 that (d) >

2d. Finally, Roth proved in


1955 that (d) can be chosen independently of the algebraic degree d.
51
Theorem 28 Let be an algebraic number of degree d 2 and let > 0. Then
there exist only nitely many rational solutions
p
q
to the inequality


p
q

<
1
q
2+
.
This shows that one can take any = (d) > 2 in (44); actually this is equivalent
to Theorem 28. Consequently, algebraic numbers are approximable of order = 2
but not better. In view of Dirichlets approximation theorem Roths theorem is best
possible with respect to the exponent 2. It might be possible that the -quantity
can be sharpened. Lang conjectured that for of degree d 3,


p
q

>
c
q
2
(log q)

has only nitely many solutions if is a constant > 1. Roth got the Fields medal
for his celebrated theorem during the International congress of mathematicians at
Edinburgh in 1958. Unfortunately, Roths theorem is ineective. The method of
proof does not provide an algorithm to determine the set of solutions. However, in
particular cases it is possible to obtain upper bounds for the number of solutions
(but this is a topic to which we will return later). A multidimensional generalization
of Roths theorem is the celebrated subspace theorem of W.M. Schmidt; a nice survey
on this modern eld of research is [43].
We can only sketch the idea of Roths ingenious proof; the technical details are
too complicated but the main ingredients become visible. We will follow [42] but
rst we review Liouvilles idea. The proof of Theorem 25 depends mainly on
the construction of a polynomial P(X) with integral coecients that vanishes
at and for which P
_
p
q
_
is small in terms of


p
q

, and
the fact that P
_
p
q
_
cannot be too small as a function of q .
Thues ingenious contribution was to introduce polynomials in two variables which
provides more freedom in the search for suitable polynomials. Before Roth it was
already known that any progress would demand the use of polynomials in more
than two variables. We start to formulate some properties which a polynomial
should have. Suppose that for 1 n m the rational approximations
pn
qn
to
satisfy


p
n
q
n

<
1
q

n
, (45)
where is some positive real number. Let P(X
1
, . . . , X
m
) be a polynomial with
integral coecients, of degree at most r
n
in X
n
for each n. It follows that

P
_
p
1
q
1
, . . . ,
p
m
q
m
_

1
Q
, where Q := q
r
1
1
. . . q
rm
m
, (46)
52
provided that
P
_
p
1
q
1
, . . . ,
p
m
q
m
_
,= 0. (47)
Let the Taylor expansion of P
_
p
1
q
1
, . . . ,
pm
qm
_
in powers of
pn
qn
be

j
1
. . .

jm
c
j
1
,...,jm
_
p
1
q
1

_
j
1
. . .
_
p
m
q
m

_
jm
.
Suppose now that P has in additon to (47) the properties that for small positive ,

j
1
. . .

jm
[c
j
1
,...,jm
[ < Q

, (48)
and c
j
1
,...,jm
= 0 for all j
1
, . . . , j
m
satisfying
q
j
1
1
. . . q
jm
m
Q

, (49)
where > 0. Taking into account (45) we have for each term in the Taylor series
with a non-zero coecient

p
1
q
1

j
1
. . .

p
m
q
m

jm
<
1
(q
j
1
1
. . . q
jm
m )

1
Q

.
Consequently,

P
_
p
1
q
1
, . . . ,
p
m
q
m
_

<
1
Q

.
In view of (46) it follows that
<
1 +

. (50)
In order to prove Theorem 28 we shall establish the existence of a polynomial P
which satises the conditions (47), (48) and (49) with near to
1
2
as 0. For this
purpose it is necessary to take the number of variables m large and to choose suitable
approximations
p
j
q
j
(actually, taking m = 2 leads only to an estimate < c

d).
In some sense, the question of good lower bounds for rational approximations to
algebraic numbers is transformed to an approximation problem for polynomials in
several variables; the use of many variables allows to put certain restrictions on the
polynomial in question. We shall go a bit more into the details that at least the
estimate > 2 becomes lucid.
The argument is indirect. We suppose that > 2. If there are innitely many
rationals satisfying (45), then we can nd a subsequence
p
1
q
1
, . . . ,
pm
qm
for which
log q
n
log q
n1
>
1

53
with some positive and large q
1
. Then we may choose positive integers r
n
satis-
fying
q
r
1
1
q
rn
n
< q
r
1
(1+

10
)
1
.
Consequently, q
mr
1
1
Q < q
mr
1
(1+)
1
. Then condition (49) takes the form that the
Taylor coecients c
j
1
,...,jm
vanish for all j
1
, . . . , j
m
with
j
1
r
1
+ . . . +
j
m
r
m
< m+ . (51)
The next step is to construct a polynomial P

that satises conditions (48) and (49)


only. This is done following an argument of Siegel which bases on Dirichlets pigeon-
hole principle. We put B = q
r
1
1
and consider the set of polynomials W(X
1
, . . . , X
m
)
of degree at most r
n
in X
n
, and having integral coecients, each of modulus less
than B. We shall try to nd two distinct polynomials W
1
, W
2
such that their deriv-
atives of order j
1
, . . . , j
m
are equal whenever X
1
= . . . = X
m
= , for all j
1
, . . . , j
m
satisfying (51). Any such derivative is of the form
a
0
+ a
1
+ . . . + a
n1

n
,
where the a
j
are integers. It can be shown that the number of possibilities for a
derivative for given j
1
, . . . , j
m
is < B
n(1+3)
, whereas the number of polynomials W
is about B
r
, where r = (r
1
+1) . . . (r
m
+1). Hence, the number of possible distinct
sets of derivatives provided that the number of sets j
1
, . . . , j
m
satisfying (51), and
with no j
k
exceeding the corresponding r
k
, is less than about
r
n(1+3)
. The number
of integer points (j
1
, . . . , j
m
) in the region dened by the above conditions can be
shown to be less that
2r
3n
if is chosen by
m+ =
1
2
m3nm
1
2
.
A more detailed analysis shows that we may assume that tends with to zero as
m . Thus we obtain
=
1
2


m

3n
m
1
2
.
This shows that the number of variables m has to be chosen large to give a value of
nearby 2 in (50). This choice yields the existence of such polynomials W
1
, W
2
.
Now the polynomial P

= W
1
W
2
satises condition (49), and it can be shown
by some estimation process that it also satises (48). We cannot expect that P

satises also condition (47). Actually, the construction of a polynomial P out of P

which satises (47) is the most dicult aspect in Roths theorem (Roths lemma).
Unfortunately, this is beyond the scope of this course; for details see the excellent
[44], [45] and [60]. The nal contradiction in Roths proof is that for suciently
large m we can nd a as close to
1
2
as we please for which the right hand side of
(50) exceeds any given > 2.
There is an interesting correspondence to complex analysis due to Vojta. It is
conjectured that the exponent 2 in Roths theorem is the same number 2 which
54
occurs in the defect relation in Nevanlinnas value-distribution theory or, roughly
speaking, there seems to be a bridge between the approximation theories for numbers
and for functions; for details see [59].
Roths theorem has many important implications. For instance, following the
lines of Liouvilles proof of the transcendence of Liouville numbers, one can obtain
a transcendence criterion for continued fractions.
Exercise 31 Let > 0 and suppose that = [a
0
, a
1
, . . .] is an irrational number
which has an innity of partial quotients satisfying a
n+1
q

n
, where q
n
is the
denominator of the n-th convergent to . Prove that is transcendental.
Another interesting consequence is Everetts attack on Fermats last theorem.
The main result of [16] states that if p is a xed odd prime and is a xed positive
integer, then there are at most nitely many pairs of coprime integers x, y on the
line y = x+ such that x
p
+y
p
is the p-th power of an integer, whereas in the case
p = 2 there are an innitude of such pairs x, y for which x
2
+ y
2
is a square; the
rst result follows tricky but elementary from Theorem 28, and the second one from
the theory of Pell equations. The most important implication of Roths theorem is
to nd in the theory of cubic curves.
14 Thue equations
During his stay in London the ingenious Indian mathematician Ramanujan spend
several weeks in hospital. Once his colleague Hardy came to visit and remarked
that he had come in taxicab number 1729, which is surely a rather dull number.
Ramanujan replied immediately that this is not true; 1729 is rather interesting since
it is the smallest integer expressible as a sum of two cubes in two dierent ways:
1729 = 1
3
+ 12
3
= 9
3
+ 10
3
.
This can be seen as follows. Since
(X + Y ) (X
2
XY + Y
2
) = X
3
+ Y
3
= 1729 = 7 13 19,
we have to consider all possible factorizations 1729 = A B and solve
A = X + Y and B = X
2
XY + Y
2
. (52)
The substitution Y = A X leads to the quadratic equation
3X
2
3AX + A
2
B = 0,
so that we only have to check if
1
6
(3A

12B 3A
2
)
is an integer. This yields the two pairs A = 13, B = 133 and A = 19, 91 which
correspond to the above given representations. We leave it to the reader to check
that all integers m < 1729 have at most one representation as a sum of two cubes.
55
Exercise 32 Find all integral solutions to the equation X
3
+ Y
3
= 403. What can
you say about the set of solutions of X
3
Y
3
= m with m = 0 and m = 1729,
respectively?
It is one thing to know that a certain diophantine equation has only nitely many
solutions. However, for a complete list of all solutions one needs to have knowledge
on their size.
Theorem 29 Let m be a positive integer. Then every solution to the equation
X
3
+ Y
3
= m (53)
in integers x, y satises the inequality
max[x[, [y[ 2
_
m
3
.
Proof. By the above reasoning each integral solution x, y satises (52) for some
factorization m = A B. Hence
m [B[ = [x
2
xy + y
2
[ =
3
4
x
2
+
_
1
2
x y
_
2

3
4
x
2
,
which gives the bound for x; since the equation is symmetric with respect to X and
Y the same estimate holds for y as well. This proves the theorem.
There is a big dierence between the sets of rational numbers and of integers with
respect to diophantine equations. We have shown that equation (53) has only nitely
many integral solutions. On the contrary it can be shown that, for example, the
equation
X
3
+ Y
3
= 9
has innitely many rational solutions. The transformation
X
12
X + Y
and Y 12
X Y
X + Y
provides a one-to-one correspondence between the rational solutions of the previous
equation and the rational points on the curve
Y
2
= X
3
48;
the latter equation denes a so-called elliptic curve (this is not an ellipse but
it is related to such objects). The theory of elliptic curves tells us that there are
innitely many rational solutions. The interested reader can nd more details in
the excellent monography [51]. A more simple but from a geometric point of view
dierent example is the unit circle.
Analyzing (53) was rather simple since our diophantine equation was related to
a reducible polynomial. If we have instead an irreducible polynomial, as for example
X
3
+2Y
3
= m, the situation is much more dicult. The rst remarkable result for
such equations was found by Thue in 1909.
56
Theorem 30 Let a, b, c be non-zero integers. Then the equation
aX
3
+ bY
3
= c (54)
has only nitely many solutions in integers.
Proof. If x, y Z is a solution to (54) then X = ax, Y = y is a solution to
X
3
+ a
2
bY
3
= a
2
c. Thus it is enough to prove the theorem with a = 1. Further,
since we may replace y by y and/or b by b in (54) if necessary, it is sucient
to consider the equation
X
3
bY
3
= c, (55)
where b, c are positive integers. The factorization method which we used in the
proof of Theorem 29 worked very well. This observation motivates to have a look
on the factorization
X
3
bY
3
= (X Y ) (X
2
+ XY +
2
Y
2
) with =
3

3. (56)
If b is a perfect cube , the latter identity is a factorization over the integers and
we can factor c and proceed as above. If b is not a perfect cube, we need a dierent
idea. Then we argue as Euler did in case of the Pell equation. If x, y N is a large
solution of (55), then the rst factor on the right hand side of (56) must have small
modulus. This follows from the estimate
x
2
+ xy +
2
y
2
=
_
x +
1
2
y
_
2
+
3
4
(y)
2

3
4

2
y
2
, (57)
which implies via (56)
c = x
3
by
3
= [x y[ [x
2
+ xy +
2
y
2
[
3
4

2
y
2
[x y[.
This gives


x
y

4c
3
2
y
3
. (58)
In view of Roths theorem it follows that the latter inequality can have only nitely
many solutions
x
y
which implies the assertion of the theorem.
Note that also Thues estimate (3) >
5
2
in (44) is sucient as well (Thue did
not have Roths bound) but not Liouvilles (3) = 3. Theorem 30 can easily be
extended to equations of higher degree, so-called Thue equations.
Exercise 33 Let P(X, Y ) be an irreducible binary form with integral coecients of
degree at least three and let m be any integer. Show that the equation P(X, Y ) = m
has only nitely many solutions in integers. [Hint: instead of (57) work with a mean-value
argument as in the proof of Liouvilles theorem.]
57
In view of Exercise 32 the condition on the irreducibility of P in the exercise above
is necessary.
The approach via diophantine approximations has its limitations. Although
the set of integral solutions is nite Thues argument does not enable to give a
complete list of solutions or indeed to determine whether the equation is soluble (we
shall discuss this question in the following section). This follows from the fact that
Roths theorem is ineective. In particular cases - here we do not mean trivial cases
as settled in Theorem 29 - one can succeed by Bakers estimates for linear forms in
logarithms. For example, Baker [2] obtained the estimate

p
q

3

10
6
q
2.9955
,
valid for all rationals
p
q
. The exponent is larger than Roths exponent 2 + but
the implicit constant is absolute. Combining this with (58) shows that any integer
solution x, y to
X
3
2Y
3
= c,
where c is a positive integer, satises the estimate
[y[ 10
1317
c
223
.
This is a large bound, but at least it only grows like a power of c.
Meanwhile more than Thues theroem is known. The more general equation
aX
3
+ bX
2
Y + cXY
2
+ dY
3
+ eX
2
+ fXY + gY
2
+ hX + iY + j = 0
describes a cubic curve. In his PhD thesis Siegel proved that a non-singular cubic
curve with integral coecients has only nitely many points with integral coordinates;
geometrically speaking, a point is called singular if it is a double point. It is easily
seen that linear or quadratic equations can have an innitude of solutions in integers
(e.g. Pells equation). Furthermore, by Faltings celebrated proof of the Mordell
conjecture we know that any curve of genus 2 has only nitely many rational
points. Note that Faltings was awarded with a Fields medal at the International
Congress of mathematicians in Berkeley 1986, following the footprints of Roth and
Baker. We do not explain here what the topological notion genus means but note
that this extends the class of homogeneous equations in two variables which Thue
could handle and covers, for example, Fermats last theorem (1). For a nicely written
overview see the last chapter of [25].
15 The abc-conjecture
There is no complete theory for the variety of diophantine equations known. And it
even seems that there exists nothing like that. In his tenth problem Hilbert asked if
there exists a universal algorithm which can determine whether a given polynomial
58
diophantine equation with integral coecients has a solution in integers. In 1970
Matjasevic gave a negative answer; for the most simple proof see [26]. Besides
he proved that the set of prime numbers is diophantine, where a set / is called
diophantine if there exists a polynomial P(X
1
, . . . , X
n
) with integral coecients
such that the equation P(X
1
, . . . , X
n1
, a) = 0 has integral solutions if and only if
a /. On the other side, Putnam showed that a set is diophantine if and only
if it coincides with the set of positive values of a suitable polynomial taken at the
nonnegative integers. Consequently, there exists a polynomial g(X
1
, . . . , X
n
) whose
positive values at integers x
j
0 are primes, and every prime can be represented this
way. Surprisingly, it is even possible to nd such polynomials; here is an example
due to Jones, Sato, Wada and Wiens of degree 25 in the 26 variables a, b, c, . . . , z :
(k + 2)
_
1 [wz + h + j q]
2
[(gk + 2g + k + 1)(h + j) + h z]
2
[2n + p + q + z e]
2
[16(k + 1)
3
(k + 2)(n 1)
2
+ 1 f
2
]
2
[e
3
(e + 2)(a + 1)
2
+ 1 o
2
]
2
[(a
2
1)y
2
+ 1 x
2
]
2
[16r
2
y
4
(a
2
1) + 1 u
2
]
2
[((a + u
2
(u
2
a))
2
1)(n + 4dy)
2
+1 (x cu)
2
]
2
[n + + v y]
2
[(a
2
1)
2
+ 1 m
2
]
2
[ai + k + 1 i]
2
[p + (a n 1) + b(2an + 2a n
2
2n 2) m]
2
[q + y(a p 1) + s(2ap + 2a p
2
2p 2) x]
2
[z + p(a p) + t(2ap p
2
1) pm]
2
_
(cf. [40]). It is not known which is the minimum possible number of variables is
(clearly, it is larger than one), and it is also not known what the minimal degree is.
With regard to Godels incompleteness theorem we nally note another remarkable
consequence: for any given axiomization there exists a diophantine equation without
integral solutions (probably in three variables only) which cannot be proved in this
axiomatic setting.
Diophantine equations are a rather dicult topic. Of course, in special cases one
might possibly succeed, sometimes even with elementary arguments.
Exercise 34 Prove that x = y = z = 0 is the only integral solution of the equation
X
3
+ 2Y
3
+ 4Z
3
34XY Z = 0.
[Hint: use Fermats method of descent, i.e. construct to a given (minimal) solution in positive
integers a smaller one.]
In particular cases it is much easier if we ask for a diophantine theory for poly-
nomial equations (very often in number theory the function eld cases are better
to handle). Let n(P) denote the number of distinct complex zeros of a polynomial
P (which does not vanish identically). Recall that polynomials are said to be co-
prime if they do not share any of their linear factors, resp. if they have distinct
59
roots. Stothers [58] and Mason [33] (independently) proved in the eighties of the
last century the following surprising result.
Theorem 31 Let a, b, c be coprime polynomials over C, not all constant. If
a + b = c,
then
maxdeg a, deg b, deg c < n(abc).
Proof. In view of the identity between the polynomials they are pairwise coprime,
i.e., their zeros are pairwise distinct. By the fundamental theorem of algebra let
a = A
da

j=1
(X
j
)
a
j
, b = B
d
b

k=1
(X
k
)
b
k
and c = C
dc

l=1
(X
l
)
c
l
,
where A, B, C are complex numbers, d
a
, d
b
, d
c
and a
j
, b
k
, c
l
are nonnegative inte-
gers, and the
j
,
k
,
l
are the distinct complex zeros of a, b and c, respectively. For
later use we note the partial fraction decomposition of the logarithmic derivative
(log a)

=
da

j=1
a
j
X
j
,
where the derivative is taken with respect to X; obviously, analogous formulas hold
for b and c, respectively. Without loss of generality we may assume that a has
maximal degree; clearly, deg a 1. Setting F = a/c and G = b/c we obtain
F + G1 = 0. This implies F

+ G

= 0 and
a
b
=
F
G
=
G

/G
F

/F
.
We compute
F

F
= (log F)

= (log a)

(log c)

=
da

j=1
a
j
X
j

dc

l=1
c
l
X
l
,
and a similar formula holds for G

/G as well. This yields


a
b
=
d
b

k=1
b
k
X
k

dc

l=1
c
l
X
l
da

j=1
a
j
X
j

dc

l=1
c
l
X
l
.
Multiplying both the denominator and the numerator with
/ :=
da

j=1
(X
j
)
d
b

k=1
(X
k
)
dc

l=1
(X
l
),
60
leads to the representation
/
B
=
a /
b /
=
a
b
,
where /, B are polynomials, both having degree < d
a
+d
b
+d
c
= n(abc) (termwise).
Since a and b are coprime, and since polynomials over the complex numbers factor
uniquely into irreducible (even in linear) factors, it follows that deg a < n(abc). The
theorem is proved.
A slightly alternative proof relies on the Riemann-Hurwitz formula from Algebraic
geometry. This idea of Silverman is outlined in [19] (which is also an excellent survey
on the abc-conjecture). An immediate and interesting consequence of Theorem 31
is Fermats last theorem for polynomials. Let n > 2 and suppose that we have
polynomials x, y, z , not all constant, with
x
n
+ y
n
= z
n
.
Without loss of generality we may assume that x, y, z are coprime and that x has
maximal degree. Put a = x
n
, b = y
n
and c = z
n
. Since the distinct zeros of
abc = (xyz)
n
coincide with the distinct zeros of xyz , application of Theorem 31
yields
ndeg x = deg a < n(abc) = n(xyz) deg xyz 3 deg x.
This proves that all polynomial solutions of the Fermat equation (1) with exponent
n 3 are trivial, which here means that they are equal up to constant factors. The
condition on the exponent is necessary as the example
(2X)
2
+ (X
2
1)
2
= (X
2
+ 1)
2
shows; this should be compared with Exercise 1. It is remarkable that Fermats
last theorem for polynomials coincides with Fermats last theorem for integers with
respect to the exponents! What can we learn out of this similarity?
For another interesting application recall that, given a polynomial D, it is un-
known how to determine whether the Pell equation P
2
DQ
2
= 1 has polynomial
solutions P, Q or not (see section 9). Taking into account Theorem 31 it can be
shown that if the number of distinct zeros of D is less than
1
2
deg D, then the
equation P
2
D Q
2
= 1is unsolvable. For this and more see [56].
One approach to solve the lack of having a general approach towards a theory of
diophantine equations is posing a general conjecture. First attempts were made by
Oesterle and in rened form by Masser soon after Theorem 31 appeared. Inspired
by the above example of Fermats last theorem one has to think about the correct
translation of Theorem 31. By the fundamental theorem of algebra each polynomial
over C splits into linear factors. So the degree of a polynomial is nothing else than
the number of linear factors, and the distinct zeros correspond one-to-one to the
distinct linear factors. The size of a number is measured by the absolute value and
the irreducible factors of an integer are its prime factors. This dictionary is simple
61
and sucient. The most common version of the abc-conjecture states that if a, b, c
are coprime integers which satisfy
a + b = c,
then, for any > 0,
max[a[, [b[, [c[ C()
_
_

p|abc
p
_
_
1+
,
where C() is a constant depending only on . The appearing product is called
squarefree kernel. Unfortunately, the appearance of the and the related im-
plicit constant are necessary as we shall explain now. Let m be a positive integer.
Fermats little theorem from elementary number theory states that if a and n and
are coprime integers, then
a
(n)
1 mod n, (59)
where (n) is the order of the group of prime residue classes mod n (Z/pZ)

.This
implies
2
23
m1
= 2
(3
m
)
1 mod 3
m
,
where . Hence, there exists an integer k for which
1 + 3
m
k = 4
3
m1
.
Setting a = 1, b = 3
m
k and c = 4
3
m1
, it is immedately seen that
/ :=

p|abc
p = 2 3

p|k
p=2,3
p 6k.
It follows that
b
/
3
m2
,
where the right hand side tends with m to innity, whereas
b
K
1+
tends to zero if /
is approximately of order k (which seems reasonable). In view of the rather general
equation a+b = c we have a plenty of interesting examples which all t surprisingly
well to the current state of knowledge on diophantine equations.
Theorem 32 The abc-conjecture implies that there exist at most nitely many non-
trivial solutions in integers of the Fermat equation
X
n
+ Y
n
= Z
n
with n 4.
Proof. If x
n
+ y
n
= z
n
with coprime integers, then put a = x
n
, b = y
n
and c = z
n
in the abc-conjecture. Without loss of generality we may assume that x, y and z
62
are positive integers. Since each prime factor of abc = (xyz)
n
divides xyz , their
product is xyz < z
3
. Hence the abc-conjecture implies
z
n
C()z
3(1+)
.
In particular, if we choose =
1
6
we get the estimate z
n
7
2
C(
1
6
) which leaves
only nitely many possibilities for z if n 4. This is the assertion.
The proof suggests that the size of the exponents and the number of variables in
diophantine equations (like the Fermat equation or also the Pell equation) seem to
be crucial when we ask for solution in integers! Another example is the equation
X
m
Y
n
= 1. (60)
Recently, Mihailescu [34] proved the Catalan conjecture which claims that the
only solution of the latter equation in integers x, y, m, n 2 is given by 3
2
2
3
= 1.
It should be noted that Tijdeman proved via Bakers bounds for linear forms that
there are only nitely many solutions.
Exercise 35 Show that the abc-conjecture implies the niteness of the set of solu-
tions of (60) in integers whenever m, n 2. What can be said about polynomial
solutions?
But the abc-conjecture can even do much more. Elkie [13] showed that abc implies
Mordells conjecture and Bombieri [6] deduced Roths theorem; Frankenhuysen [17]
unied both results and conjectured a renement of Roths statement (with respect
to the appearance of in the eponent). Seeing how powerful the abc-conjecture
is, we might be sceptic about its truth. On the contrary it produces exactly the
results which are already proved or which are conjectured by a dierent reasoning.
A proof of the abc-conjecture seems to be out of reach so far. There is some hope to
generalize Wiles proof of Fermats last theorem (which bases on an ingenious idea
of Frey) along the theory of modular forms and Galois representations. Another
attempt are again Bakers bounds for linear forms, but both approaches are not
capable without new ideas. The best unconditional result (up to the appearing )
is the estimate
max[a[, [b[, [c[ exp
_
_
C

p|abc
p
1
3
+
_
_
,
where the constant C is an eectively computable positive constant, due to Stewart
and Kunrui [57] (using contributions by Waldschmidt and Tijdeman in advance).
16 p-adic numbers
p-adic numbers were discovered (or created, but this is a philosophical question)
about hundred years ago by Hensel [22] (who was professor in Marburg). Meanwhile
63
the theory of p-adic numbers has a plenty of applications and impacts in various
mathematical elds but its origin lies in the theory of diophantine equations. We
follow the monography [18] as well as Neukirchs survey in [12] and [55].
By the unique prime factorization of integers every rational number ,= 0 has
a representation
=

p
p
(;p)
with (; p) Z;
here the product is taken over all prime numbers p, but in fact only nitely many
of the p-exponents (; p) of are non-zero. Thus, if we x a prime p, then we
may write
=
a
b
p
(;p)
with 0 ,= a, b Z, p ,[ab.
Here and in the sequel p denotes always a prime number or the symbol which
we will explain later. We dene the p-adic absolute value on Q by setting
[[
p
=
_
p
(;p)
if ,= 0,
0 if = 0;
the function (; p) is called p-adic valuation. An absolute value on a
eld K is a function [ [ : K R satisfying the axioms
[x[ 0 for all x K, and [x[ = 0 if and only if x = 0;
[x y[ = [x[ [y[ for all x, y K;
[x + y[ [x[ +[y[ for all x, y K.
If the last axiom can be replaced by
[x + y[ max[x[, [y[ for all x, y K, (61)
the absolute value is said to be non-archimedean; otherwise the absolute value is
called archimedean. The well-known absolute value
[[

=
_
if 0,
if < 0,
is the standard example of an archimedean absolute value on K; the notation [ [

is traditional in the context of p-adic numbers. An example of a non-archimedean


absolute value is the trivial absolute value which is constant 1 on all non-zero
elements, but this is boring. More interesting are p-adic absolute values. It is easy
to check that [ [
p
is indeed an absolute value, and it is not much more dicult
to prove that even (61) is satised; the main idea for the proof can be found by
studying the following example
[3 5 + 2 3
2
[
3
= [3(5 + 2 3)[
3
=
1
3
= max[3 5[
3
, [2 3
2
[
3
.
64
Exercise 36 Prove that the p-adic absolute is a non-archimedean absolute value.
Show that an absolute value [ [ on a eld K is non-archimedean if and only if
sup[n[ : n N < ; (62)
Remark: note that, since any positive integer n has a representation n = 1 + . . . + 1, N has a
natural embedding into any eld K.
Taking into account characterization (62) we see that the origin of the notion of an
archimedean absolute value is caused by Archimedes lemma which states that for
all non-zero integers, resp. rationals x, y there exists a positive integer m with
[mx[

> [y[

.
An absolute value on a eld K induces a topology. Since an absolute value [ [ is a
norm, we may dene a metric by setting
d(x, y) = [x y[ for x, y K.
Now we can measure distances on K. If [ [ is a non-archimedean absolute value, then
we call the corresponding metric ultrametric and K together with this ultrametric
is said to be an ultrametric space. Obviously, with view to (61), a metric is
ultrametric if and only if
d(x, y) max d(x, z), d(y, z) for all x, y, z K
holds; the latter inequality is called the ultrametric inequality. We note that a
rational number has a small p-adic absolute value if and only if is divisible by
a large power of p. This was the underlying idea for Hensel in introducing p-adic
numbers. Divisibility properties of the integers are the fundamentals in number
theory!
We call two absolute values equivalent if they induce the same topology. It is
natural to ask what kind of absolute values a given eld has. In 1918 Ostrowski gave
a full description of the absolute values of the eld of rational numbers; he proved
that every non-trivial absolute value on Q is equivalent to one of the absolute values
[ [
p
, where either p is a prime number or p = . This ts perfectly to the product
formula

p
[[
p
= 1 for any 0 ,= Q;
the latter identity follows easily from the denition of the p-adic absolute value and
the unique prime factorization. The primes p are also called places and p =
stands for the innite place.
p-adic absolute values imply a curious convergence. Consider the linear equation
X = pX + 1.
65
It is a simple task to nd the solution by separating all X-terms. But we shall try
something completely dierent. The iteration
x
0
:= 1 , x
n
:= px
n1
+ 1 for n = 1, 2, 3, . . .
leads to the sequence
x
n
= 1 + p + p
2
+ . . . + p
n
=
1 p
n+1
1 p
.
Since

p
n+1
1 p

p
= [p
n+1
[
p
[1 p[
1
p
= p
n1
tends to zero as n , we see that the sequence (x
n
) is p-adically convergent.
Moreover, we obtain a strange formula for the geometric series

k=0
p
k
=
1
1 p
, (63)
which is with respect to the usual absolute value divergent! Actually, this is the
solution of the equation in question (since the solution, resp. the limit, is rational
we are out of any trouble). However, the same reasoning as for (63) can be applied
to other series as well:
=

k
a
k
p
k
with
k
Z, 0 a
k
< p, (64)
where Z (but = is forbidden), are p-adically convergent. Moreover, the
sequences of their partial sums are all Cauchy sequences. This follows immediately
from the following assertion. A sequence of rational numbers (
n
) is a Cauchy
sequence with respect to a p-adic absolute value [ [
p
if and only if
lim
n
[
n+1

n
[
p
= 0. (65)
This gives a rst glimpse on p-adic analysis and shows how much it diers from real
analysis. To prove (65) we write m = n +r > n, and note with regard to (61) that
[
m

n
[
p
= [
n+r

n+r1
+
n+r1
. .
=0

n+r2
. . . +
n+1

n
[
p
max[
n+r

n+r1
[
p
, . . . , [
n+1

n+r1
[
p
.
Similarly to the decimal fraction expansion we have the following irrationality
criterion:
Exercise 37 Show that the series (64) has a rational limit if and only if the sequence
of the ciphers a
k
is eventually periodic.
66
We know that Q is not complete with respect to the usual archimedean absolute
value (e.g., Theorem 1). Actually, by the latter exercise the same is true if we take
any p-adic absolute value. Thus the eld of rational numbers is not complete to
any of its non-trivial absolute values. To get out of this trouble we may complete
Q with respect to a p-adic absolute value, analogously to Cantors construction of
the real numbers as a completion of Q with respect to [ [
p
. Denote by (
p
the set
of all p-adic Cauchy sequences (
n
). We dene addition and multiplication by
(
n
) + (
n
) = (
n
+
n
) and (
n
) (
n
) = (
n

n
).
This makes (
p
to a commutative ring. The subset /
p
consisting of all Cauchy
sequences (
n
) with lim
n

n
= 0 is an ideal (thats clear). In fact, /
p
is even
a maximal ideal which can be seen as follows.
Suppose we have an arbitrary Cauchy sequence (
n
) (
p
/
p
, i.e.,
lim
n

n
,= 0. Then there exist a constant c and an integer N such that
[
n
[
p
c > 0 for n N.
Dene (
n
) by setting
n
= 0 if n < N, and
n
=
1
n
otherwise. For n N,
[
n+1

n
[
p
=

n+1

p
=

n+1

n+1

1
c
2
[
n+1

n
[
p
.
Hence, it follows from (65) that (
n
) is a Cauchy sequence since (
n
) is one. Note
that lim
n

n

n
= 1, and thus
(1) (
n
) (
n
) /
p
.
This shows that the ideal which is generated by (
n
) and /
p
is equal to the ideal
generated by (1), but that is the whole ring (
p
. This shows that /
p
is maximal.
From algebra we know that the quotient of a commutative ring by its maximal
ideal is a eld.
Theorem 33 Q
p
:= (
p
//
p
is a eld, the eld of p-adic numbers.
We can embed Q via the mapping (, , . . .) in a natural way into Q
p
, and
thus Q
p
is the completion of Q with respect to [ [
p
. Obviously, Q lies dense in
Q
p
. The p-adic absolute value can be continued from Q onto Q
p
. In view of
Ostrowskis theorem, the non-equivalent non-trivial absolute values on Q lead to a
family of elds lying above Q:
Q
completion
Q
2
, Q
3
, Q
5
, and Q

= R.
Nearby the world of real numbers exist - with the same right - for each prime number
p the world of p-adic numbers.
67
But how do p-adic numbers look like? Actually, we know p-adic numbers already.
Recall the series representations (64). By (65) these series are the limits of Cauchy
sequences of rational numbers. We may identify them with elements in Q
p
, i.e., each
series (64) denes a p-adic number , and any p-adic number has a representation
of the form (64). The rst assertion is clear, the second one can be shown by an
approximation argument as follows. Firts, suppose that is a p-adic number with
(; p) 0 and n is a positive integer. Since Q is dense in Q
p
we can nd a
rational number
x
y
such that


x
y

p
p
n
.
In view of (61)

x
y

p
=

x
y
+
. .
=0

p
max
_
_
_
[[
p
,


x
y

p
_
_
_
1.
Consequently, p does not divide y. Hence there exists an integer Y
n
such that
yY
n
1 mod p
n
. Moreover, by (8) we may assume that 0 xY
n
p
n
1. This
implies

x
y
xY
n

p
=

x xyY
n
y

p
p
n
.
Put
n
= xY
n
. Then, with regard to the above estimates,
[
n
[
p
=

x
y
+
x
y
. .
=0

p
max
_
_
_


x
y

p
,

x
y

n

p
_
_
_
p
n
.
By induction it follows that for any such there exists a Cauchy sequence of integers
(
n
) converging to for which
0
n
p
n
1 and
n

n1
mod p
n1
.
This implies the existence of a sequence of integers a
k
with

n
= a
0
+ a
1
p + . . . + a
n
p
n
and 0 a
k
p 1, (66)
which leads to a representation of of the form (64); the general case is deduced
form the one above by considering p
(;p)
instead of .
Hensel introduced p-adic numbers in an ad hoc manner via (64). His idea was
to transport the powerful method of power series from analysis to number theory.
This direct approach is of advantage for explicit computations.
Exercise 38 Calculate the 3-adic expansions of
5
9
,
9
5
,
5
9
+
9
5
and
5
9

9
5
. What is the
p-adic value of

k=0
(p 1)p
k
?
68
We shall give a second, purely algebraic construction of Q
p
which we will use
later on. Denote by Z/p
n
Z the ring of residue classes mod p
n
for n N. Then we
may dene the sequence of maps
. . . Z/p
n+1
Z Z/p
n
Z . . . Z/p
2
Z Z/pZ,
where each map is the natural projection

n
: Z/p
n+1
Z Z/p
n
Z , x x mod p
n
.
Now we can dene the projective limit
lim

Z/p
n
Z =
_
_
_
(x
n
)

n1
Z/p
n
Z : (x
n+1
) = x
n
_
_
_
.
Being a formal product of rings the projective limit inherits the ring structure of
its factors. As a matter of fact we see that each (x
n
) lim

Z/p
n
Z gives raise to
a Cauchy sequence of integers
n
(with respect to the p-adic absolute value) such
that
x
n

n
mod p
n
,
resp. a sequence of integers (a
k
) satisfying (66). It follows that
lim

Z/p
n
Z

=
_

k=0
a
k
p
k
: 0 a
k
p 1
_
. (67)
Denote the right hand side above by Z
p
. With the projective limit also Z
p
is a
commutative ring. Now we can dene Q
p
as the fraction eld of Z
p
. According to
this construction we say that a p-adic number with [[
p
1, resp. (; p) 0,
is a p-adic integer, and Z
p
becomes the ring of p-adic integers.
17 The Local-global principle
The p-adic way of thinking gives a new strategy to attack diophantine equations.
Consider the equation
P(X
1
, X
2
, . . . , X
r
) = 0, (68)
where P is a polynomial in several variables with integral coecients. We are
interested in the solubility in integers. We can weaken this dicult problem by
replacing (68) by the system of congruences
P(X
1
, X
2
, . . . , X
r
) 0 mod m,
where m runs through all positive integers, or equivalently, by the chinese remainder
theorem,
P(X
1
, X
2
, . . . , X
r
) 0 mod p
n
with n = 1, 2, . . . , (69)
69
where p runs through all prime numbers. This approach yields sometimes certain
information about the original equation. We shall illustrate this by an interesting
example. We ask for the integral solutions of
Y
2
= X
3
+ 7.
By Exercise 33 we know that there are at most nitely many solutions in integers.
We may rewrite the equation in question as
Y
2
+ 1 = (X + 2)((X 1)
2
+ 3).
If x, y Z is a solution, then (x 1)
2
+ 3 3 mod 4. Hence there is a prime
p 3 mod 4 dividing it, and reducing the equation in question modulo p shows
that 1 is a square mod p which gives a contradiction. Thus, there are no integer
solutions to the equation in question.
Exercise 39 Show that the Pellian minus-equation
Y
2
dY
2
= 1
has no integral solutions if d 3 mod 4.
Up till now we made out of one equation innitely many congruences. But we can
go further. It is a very astonishing and useful fact that all these unpleasantly many
congruences (69) for a xed prime can be summed up to one p-adic equation.
Theorem 34 With the conditions on P from above, the system of congruences (69)
is solvable if and only if the equation (68) has a solution in p-adic integers.
Proof. Via the projective limit (67) we have
Z
p

= lim

Z/p
n
Z

n=1
Z/p
n
Z.
The equation (68) splits over the ring on the right hand side into the system of
congruences (69). Thus, each p-adic solution of (68) leads also to a solution of (69).
Conversely, for any positive integer n, let (x
(n)
1
, x
(n)
2
, . . . , x
(n)
r
) be a solution of
P(X
1
, X
2
, . . . , X
r
) 0 mod p
n
.
If all elements
(x
(n)
1
)
n
, (x
(n)
2
)
n
, . . . , (x
(n)
r
)
n

n=1
Z/p
n
Z
lie in lim

Z/p
n
Z = Z
p
, we are ready. Otherwise, we have to construct a subse-
quence with this property. We only consider the case r = 1 and write x instead of
x
1
; the general case can be treated similarly. Since Z/pZ is nite there are innitely
70
many terms of x
(n)
which are congruent to some xed y
1
Z/pZ. Thus we can
nd a subsequence x
(n)
1
of x
(n)
for which
x
(n)
1
y
1
and P
_
x
(n)
1
_
0 mod p.
Obviously, we can continue and obtain, for any k 2, a subsequence x
(n)
k
of
x
(n)
k1
such that
x
(n)
k
y
k
and P
_
x
(n)
k
_
0 mod p
k
,
where the y
k
Z/p
k
Z are related by
y
k
y
k1
mod p
k1
.
The y
k
dene a p-adic integer y = (y
k
)
k
lim

Z/p
k
Z

= Z
p
satisfying
P(y
k
) 0 mod p
k
for every k N.
By continuity this implies P(y) = 0. The theorem is proved.
If the polynomial P is homogeneous, the equation (68) has always the trivial solution
x
1
= . . . = x
r
= 0. In this context it is more natural to ask for non-trivial solutions,
i.e., solutions where not all x
j
equal zero. In this case a little variation of the proof
above gives the corresponding statement for a non-trivial p-adic solution.
We shall give an example for Theorem 34. Consider the congruences
X
2
2 mod 7
n
(n = 1, 2, . . .). (70)
Since 2 is a quadratic residue mod 7 the congruence is solvable for n = 1. Indeed,
we nd thesolutions 3 mod 7. We start with x
1
+3 mod 7. Now let n = 2.
Any solution x
2
of (70) with n = 2 satises also (70) with n = 1. We do the ansatz
x
2
3 + 7z which leads in (70) to
2 (3 + 7z)
2
= 9 + 6z 7 + 7
2
z
2
2 + (1 + 6z) 7 mod 7
2
,
resp.
0 1 + 6z mod 7.
Thus, z 1 mod 7 and we obtain x
2
3 +1 7 mod 7
2
. Since in any further step
we have to solve only linear congruences mod 7, this process continues ad innitum
and leads to
x = 3 + 1 7 + 2 7
2
+ . . . ,
which is a 7-adic solution of the equation X
2
= 2; we may write x =

2 but notice
that this is not the square root of 2 in the eld of real numbers. The other solution
can be found the same way starting with x
1
3 mod 7. There is another aspect
in this example which is of certain interest. The crucial step above was the fact
that 2 is a quadratic residue mod 7, linear congruences are always solvable. This
observation shows that a positive integer is a square in Z
p
if and only if is a
quadratic residue mod p. Furthermore, we have
71
Theorem 35 A rational number is a square if and only if it is a square in all
Q
p
for all p .
Proof. One implication follows easily from the embedding Q Q
p
. For the other
one we have a look on the prime factorization of :
=

p<
p
(;p)
.
If is a square in the eld of real numbers, is positive. Furthermore, if is a
square in Q
p
, the exponent (; p) is even. This proves the other implication.
Theorem 35 has interesting consequences for the structure of p-adic elds.
Exercise 40 Show that two distinct p-adic elds are non-isomorphic. Further,
prove that R and any Q
p
are non-isomorphic.
Finite extensions of Q are called global, completions of global elds with discrete
valuation and nite residue eld are local. The so-called local-global principle
is the idea of putting together information from all local elds Q
p
and additionally
R = Q

, to get information in the global eld Q. This principle is extraordinary


successful and seems to go back to Hensel but was rst clearly stated by Hasse.
Theorem 35 is only the very beginning but gives a rst glimpse of its power. With
this concept Hasse was able to give around 1920 an important characterization of
so-called isotropic quadratic forms. Let K be a eld. Then we say that a quadratic
form P K[X
1
, . . . , X
r
] is isotropic over K if there exist x
1
, . . . , x
r
K, not all
x
k
= 0, such that
P(x
1
, . . . , x
r
) = 0.
It can be shown that isotropic quadratic forms take all values, i.e., for any K,
the equation
P(X
1
, . . . , X
r
) =
has a solution in K. A rst but unsatisfying classication of isotropic quadratic
forms was found by Minkowski in 1890. He proved that if P(X
1
, . . . , X
r
) is a
quadratic form with integer coecients for which (69) with any prime p has a non-
trivial solution, and if P(X
1
, . . . , X
r
) has a non-trivial real solution, then P is
isotropic over Q. In view of Theorem 34 this can be translated into the celebrated
theorem of Hasse-Minkowski which gives a particular solution of Hilberts eleventh
problem on representations of integers in algebraic number elds by quadratic forms.
Theorem 36 A quadratic form is isotropic over Q if and only if it is isotropic over
all Q
p
, p .
Hasses p-adic proof is much easier and more natural than Minkowskis approach,
but anyway even Hasses proof is far beyond the scope of these notes (it uses deeper
knowledge on the theory of quadratic forms, the multiplicative structure of Q
p
and
72
its subgroup of squares, and even Dirichlets prime number theorem for arithmetic
progressions). We refer the interested reader to [50].
It is easily seen that the ring of p-adic units in Z
p
is given by
Z

p
:= Q
p
: [[
p
= 1.
One can show that if p is an odd prime, then any quadratic form in at least three
variables with at least three coecients in Z

p
is isotropic over Q
p
. This is not only
helpful in the proof of the theorem of Hasse-Minkowski but also a useful tool for
concrete computations.
Exercise 41 Prove that the equation
3X
2
5Y
2
7Z
2
= 0
has a non-trivial solution in rational numbers x, y, z , while it fails to have a solution
dierent from the trivial one when we change the sign by 5Y
2
.
We note some classic consequences of the theorem of Hasse-Minkowski:
Lagrange proved that every positive integer can be written as a sum of at most
four squares;
Gauss showed that every positive integer has a representation as a sum of at
most three triangle numbers.
For proofs of these corollaries see also [50]. The local-global principle has also
limitations. For example, by the theory of quadratic residues it is easily seen that
the equation
(X
2
2)(X
2
17)(X
2
34) = 0
has solutions in all Q
p
, p , but obviously it has no solution in Q.
18 Hensels lemma
Recall the construction of the 7-adic

2 from the last section. In a sense, solving


the linear congruences step by step can be understood as applying the well-known
Newton iteration method from real analysis to the polynomial P(X) = X
2
2.
Starting with x
1
= 3 the iteration
x
n
= x
n1

P(x
n1
)
P

(x
n1
)
= x
n1
p
P(x
n
)
p
(P

(x
n1
))
1
,
yields
x
2
= 3 7
3
2
2
7
6
1
= 3 + 1 7,
where 6
1
= 1 has to be understood as an equation in Z/7Z. Continuing this
iteration process gives the same 7-adic value as we computed above. This observa-
tion leads to a very important theorem due to Hensel, called Hensels lemma (which
is also very useful in the proof of Theorem 36).
73
Theorem 37 Let P(X) be a polynomial with coecients in Z
p
and suppose that
there is a p-adic integer x
1
such that
P(x
1
) 0 mod pZ
p
and P

(x
1
) , 0 mod pZ
p
.
Then there exists a p-adic integer x with
x x
1
mod pZ
p
and P(x) = 0.
Hensels lemma is very likely the most important algebraic property of the p-adic
numbers. In many circumstances one can decide very easily whether a polynomial
has roots in Z
p
. The test involves nding an approximation x
1
of a root x and then
verifying a condition on the (formal) derivative of the polynomial in question. This
is certainly a highlight in combining diophantine approximations and diophantine
equations! Moreover, where Newtons iteration method can fail (if the starting point
of the iteration is not carefully chosen or the graph of the polynomial is not convex in
a small neighbourhood of the root) Hensels lemma always succeeds if the conditions
are fullled.
Proof. The existence of the root x will follow from a construction of an appropriate
Cauchy sequence converging to x. More precisely, we have to show that there are
x
n
Z
p
satisfying
x
n
x
n+1
mod p
n
and P(x
n
) 0 mod p
n
Z
p
.
The existence of x
1
follows from the assumption of the theorem. For x
2
we do the
ansatz x
2
= x
1
+zp (as in the proof of Theorem 34). In view of the formal identity
(resp. Taylor expansion)
P(X + h) = P(X) + hP

(X) +
h
2
2!
P

(X) + . . . ,
we get
P(x
2
) = P(x
1
+ zp) P(x
1
) + zpP

(x
1
) mod p
2
Z
p
.
Since P(x
1
) 0 mod pZ
p
there is some y for which P(x
1
) = yp. Thus we have to
solve the linear congruence
p (y + zP

(x
1
)) 0 mod p
2
Z
p
,
resp.
y + zP

(x
1
) 0 mod pZ
p
.
Since P

(x
1
) is not divisble by p it is invertible in Z
p
. Thus we can take z with
0 z < p and
z y(P

(x
1
))
1
mod pZ
p
.
This denes x
2
and by the same procedure we obtain the desired sequence by
induction. It is clear that this sequence is Cauchy and, by continuity, the limit x
satises P(x) = 0. Hensels lemma is proved.
74
A nice application of Hensels lemma is to determine the roots of unity in Q
p
.
Recall that is called an n-th root of unity if
n
= 1. We have to study
P(X) = X
n
1. For an n-th root of unity we have [
n
[
p
= 1, and therefore
Z

p
. It is easily seen that the roots of unity in Q
2
are given by = 1. The
cases of odd primes is a bit more dicult. Suppose that p 3 and let x Z
p
.
Obviously, P

(x) = nx
n1
is congruent zero modulo pZ
p
if either p divides x, in
which case x will not be a root of P anyway, or p divides n.
First, suppose that p and n are coprime. Then, by Hensels lemma, for every
x
1
, 0 mod pZ
p
we can nd an x Z
p
for which P(x) = 0 and x x
1
mod pZ
p
.
It follows that the equation P(X) = 0 has as many distinct solutions in Z

p
as in
the group of prime residue classes. In view of Fermats little theorem (59) (Z/pZ)

consists exactly out of p 1 distinct (p 1)-th roots of unity. Consequently, each


one corresponds to a (p 1)-th root of unity in Q
p
.
Now assume that n and p are not coprime. Without loss of generality we may
suppose that n = p (since each p-th root of unity is also a (kp)-th root of unity),
and it remains to show that we cannot nd any p-th root of unity ,= 1 in Q
p
. Let
x = x
1
+ zp, where 0 x
1
< p and z Z
p
. Then
x
p
x
p
1
=
p

k=1
_
p
k
_
x
pk
1
(zp)
k
,
which obviously lies in pZ
p
. If x
p
= 1, then it follows from (59) that x
1
= 1. This
leads to
1 = x
p
= (1 + zp)
p
= 1 + zp
2
+
p1

k=2
_
p
k
_
(zp)
k
+ z
p
p
p
.
Suppose that z ,= 0, then
zp
2
=
p1

k=2
_
p
k
_
(zp)
k
+ z
p
p
p
,
and, by (61),
[z[
p
2 = [ zp
2
[
p
max
2kp1
_

_
p
k
_
z
k
p
k

p
, [z
p
p
p
[
p
_
.
A short computation gives a contradiction. So we have z = 0 which implies x = 1.
Thus we have proved
Theorem 38 Let p be an odd prime. Q
p
contains exactly the (p 1)-th roots of
unity.
In particular, we see that any Q
p
is not algebraically closed. (It would have been
possible to see this before! Where?) This leads to new adventures (as in the real
case) but this is another story...
You remember how he discovered the North Pole; well, he was so proud of this that
he asked Christopher Robin if there were any other Poles such as a bear of little
brain might discover. (A.A. Milne, Winnie-the Pooh)
75
References
[1] Archimedes, The cattle problem, in English verse by S.J.P. Hillion and H.W.
Lenstra, Mercator, Santpoort 1999
[2] A. Baker, Contributions to the theory of Diophantine equations II. The Dio-
phantine equation y
2
= x
3
+ k, Phil. Trans. Roy. Soc. London 263 (1967/68),
193-208
[3] A. Baker, Transcendental number theory, Cambridge University Press 1975
[4] A. Baker, G. W ustholz, Number theory, transcendence and diophantine
geometry in the next millenium, 1-12, in Mathematics: frontiers and perspec-
tives, V. Arnold et al. (editors), AMS 2000
[5] L. Berggren, J. Borwein, P. Borwein, : a scource book, Springer 1997
[6] E. Bombieri, Roths theorem and the abc-conjecture, preprint ETH Z urich,
1994
[7] D.M. Bressoud, Factorization and primality testing, Springer 1989
[8] P. Bundschuh, Einf uhrung in die Zahlentheorie, Springer 1991, 2. Auage
[9] E.B. Burger, Exploring the number jungle: a journey into Diophantine
Analysis, AMS 2000
[10] H. Cohen, A course in Computational Algebaric number theory, Springer 1993
[11] J. Dieudonne, Geschichte der Mathematik 1700-1900, Vieweg 1978
[12] Ebbinghaus et al., Zahlen, Springer 1992, 3rd ed.
[13] N.D. Elkies, ABC implies Mordell, Intern. Math. Research Not. 7 (1991),
99-109
[14] C. Elsner, J.W. Sander, J. Steuding, Kettenbr uche als Summen eben-
solcher, Math. Slov. 51 (2001), 281-293
[15] P. Erd os, Representation of real numbers as sums and products of Liouville
numbers, Michigan Math. J. 9 (1962), 59-60
[16] C.J. Everett, Fermats conjecture, Roths theorem, Pythagorian triangles,
and Pells equation, Duke J. 801-804
[17] M. van Frankenhuysen, The abc-conjecture implies Roths theorem and
Mordells conjecture, Math. Contemp. 16 (1999), 45-72
[18] F.Q. Gouvea, p-adic numbers, Springer 1993
76
[19] A. Granville, T.J. Tucker, Its as easy as abc, Notices A.M.S. (2002),
1224-1231
[20] M.Jr. Hall, On the sum and product of continued fractions, Ann. of Math.
28 (1947), 966-993
[21] G.H. Hardy, E.M. Wright, An introduction to the theory of numbers, Ox-
ford University Press 1938
[22] K. Hensel,

Uber eine neue Begr undung der Theorie der algebraischen Zahlen,
Jahresberichte Deutschen Mathematiker Vereinigung 6 (1897), 83-88
[23] E. Hlawka, Theorie der Gleichverteilung, BIB 1979
[24] M.N. Huxley, The distribution of prime numbers, Oxford 1972
[25] K. Ireland, M. Rosen, A classical introduction to modern number theory,
Springer 1990, 2nd ed.
[26] J.P. Jones, Y.V. Matijasevi c, Proof of recursive unsolvability of Hilberts
tenth problem, Amer. Math. Monthly 98 (1991), 689-709
[27] A. Khintchine, Kettenbr uche, Teubner 1956
[28] N. Koblitz, A course in Number theory and Cryptography, Springer 1994, 2nd
ed.
[29] S. Lang, Introduction to diophantine approximations, Springer 1995, 2n ed.
[30] S. Lang, Complex Analysis, Springer 1999, 4th ed.
[31] H.W. Lenstra, Solving the Pell equation, Notices Amer. Math. Soc. 49
(2002), 182-192
[32] A.A. Markoff, Sur les formes binaires indenies I+II, Math. Ann. 15 (1879),
281-309; 17 (1880), 379-400
[33] R.C. Mason, Diophantine equations over function elds, LMS Lecture Notes
96, Cambridge University Press 1984
[34] P. Mih ailescu, Primary cyclotomic units and a proof for Catalans conjecture,
preprint
[35] L.J. Mordell, Diophantine equations, Academic Press 1969
[36] M.B. Nathanson, Polynomial Pell equations, Proc. A.M.S. 56 (1976), 89-92
[37] I. Niven, A simple proof that is irrational, Bull. Amer. Math. Soc. 53 (1947),
509
77
[38] O. Perron, Die Lehre von den Kettenbr uchen, vol. I, Teubner 1954
[39] A. van der Poorten, A proof that Euler missed, Math. Intelligencer 195-203
[40] P. Ribenboim, The book of prime number records, Springer 1989, 2nd ed.
[41] G.J. Rieger, Zahlentheorie, Vandenhoeck & Ruprecht, G ottingen 1976
[42] K.F. Roth, Rational approximations to algebraic numbers, Proceedings of the
ICM 1958, Edinburgh, Cambridge University Press, 203-210
[43] H.P. Schlickewei, The subspace theorem and applications, Doc. Math. J.
DMV, Extra volume ICM 1998, vol. II, 197-206
[44] W.M. Schmidt, Diophantine Approximation, Springer 1980, Lecture Notes
785
[45] T. Schneider, Einf uhrung in die transzendenten Zahlen, Springer 1957
[46] M.R. Schroeder, Number theory in science and communications, Springer
1997, 3rd ed.
[47] F. Schwarz, Wieviele Rinder hat der Sonnengott?, Mitteilungen der
Deutschen Mathematiker Vereinigung 2 (1997), 13-18
[48] W. Schwarz, Geschichte der Zahlentheorie, www.math.uni-frankfurt.de/ steud-
ing/schwarz.html
[49] C. Series, The geometry of Marko numbers, Math. Intelligencer 7, no.3
(1985), 20-29
[50] J.P. Serre, A course in Arithmetic, Springer 1973
[51] J.H. Silverman, J. Tate, Rational points on elliptic curves, Springer 1992
[52] S. Singh, Fermats last theorem, Fourth Estate, London 1997
[53] R.

Slezevicien e, J. Steuding, Simpsons paradox in the Farey sequence,
preprint
[54] R.

Slezevicien e, J. Steuding, Factoring with continued fractions and the
Pell equation, in preparation
[55] J. Steuding, The world of p-adic numbers and p-adic functions, Proc. Sci.
Seminar Fac. Phys. Math.

Siauliai Univ. 5 (2002), 90-107
[56] J. Steuding, A note on the Polynomial Pell equation, preprint 2003
[57] C.L. Stewart, Kunrui Yu, On the abc conjecture, II, Duke J. 108 (2001),
169-
78
[58] W.W. Stothers, Polynomial identities and hauptmoduln, Quart. J. Math. 32
(1981), 349-370
[59] P. Vojta, Nevanlinna theory and diophantine approximation, Several complex
variables, Berkely, CA, 1995/96, Math. Sci. Res. Inst. Publ. 37, Cambridge
University Press 1999
[60] G. W ustholz, Ausgewahlte Kapitel aus der Zahlentheorie und Geometrie,
Lecture Notes 1996, www.math.uga.edu/ ntheory/N4.html
[61] W. Zudilin, Arithmetic of linear forms involving odd zeta values, (2001),
arXiv:math.NT
79

You might also like