Maximal Curves and Applications to Coding Theory
Rachel Pries and Beth Malmskog
January 2011

Error-correcting codes are used to reliably transmit data in many ways, from
cell phones to satellites. One way to construct good error-correcting codes is to use
number theory and algebraic geometry. In this course, we will learn about curves
defined over finite fields and their applications to coding theory.

1. Finite fields
2. Error-correcting codes
3. Points on curves
4. Reed-Solomon codes
5. Goppa codes and maximal curves
6. Research topics

References: Hardy and Walker, Codes, Ciphers and Discrete Algorithms, Discrete
Mathematics and its Applications, Prentice Hall, 2009.

Walker, Codes and Curves, Student Mathematical Library Volume 7, IAS/Park City Mathematical Subseries, American Mathematical Society, 2000.

1 Finite fields
Definition 1.1. A field is a commutative ring F with identity such that every
non-zero element has a multiplicative inverse.
In other words, a field is a set of numbers, with two binary operations (addi-
tion and multiplication) satisfying certain associative, distributive, and commutative
rules. In practice, this means we can add, subtract, multiply, and divide by anything
except 0. Let’s think of some examples.

The field Z/p In this course, the most important examples are the finite fields
Z/p, where p is a prime number. Recall that Z/p is the set {0, 1, 2, . . . , p − 1} with
binary operations of addition and multiplication modulo p. Another viewpoint is
that Z/p is the set of equivalence classes of integers, under the equivalence relation
of congruence modulo p.
There are several ways to show that Z/p is a field. In one method, given a ≢ 0 mod p, you show that a must have a multiplicative inverse. To do this, you solve the Diophantine equation ax + py = 1.
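As a concrete illustration (a sketch of ours, not part of the notes), the extended Euclidean algorithm below solves ax + py = 1 in Python and so produces the inverse of a modulo p; the function names are made up for this example.

def extended_gcd(a, b):
    # Return (g, x, y) with a*x + b*y = g = gcd(a, b).
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # Since b*x + (a % b)*y = g and a % b = a - (a // b)*b, rearrange:
    return g, y, x - (a // b) * y

def inverse_mod(a, p):
    # Multiplicative inverse of a modulo a prime p, assuming a is non-zero mod p.
    g, x, _ = extended_gcd(a % p, p)
    assert g == 1, "a must be invertible modulo p"
    return x % p

print(inverse_mod(7, 11))              # 8, since 7 * 8 = 56 = 5*11 + 1
print((7 * inverse_mod(7, 11)) % 11)   # 1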
Here is an important property of the field Z/p.
Proposition 1.2 (Fermat’s Little Theorem). If α ∈ Z/p then α^p − α = 0.
Corollary 1.3. In the polynomial ring Z/p[x], the polynomial x^p − x factors as ∏_{α=0}^{p−1} (x − α).

Factorization Let k be a field (for example Q or Z/p) and let k[x] be the ring of
polynomials with coefficients in k.
Definition 1.4. A polynomial g(x) factors in k[x] if g(x) = h1(x)h2(x), where h1(x) and h2(x) are polynomials in k[x] whose degrees are less than deg(g(x)). If g(x) does not factor over k, then it is irreducible in k[x]. A polynomial is monic if its leading coefficient equals 1.
Theorem 1.5. The polynomial ring k[x] has unique factorization. If f (x) ∈ k[x] is
non-constant of degree d, then f (x) has at most d roots in k.
Proof. The main idea behind the proof is that k[x] has a Euclidean algorithm and
so is a principal ideal domain and a unique factorization domain.

Constructing fields It is possible to construct fields larger than k by taking quotients of k[x] or, equivalently, taking algebraic extensions of k.
Theorem 1.6. Suppose k is a field and g(x) is a polynomial which is irreducible in k[x]. Then the ideal I = ⟨g(x)⟩ of k[x] is maximal and the quotient ring k[x]/I is a field.

Proof. Since k[x] is a principal ideal domain, every ideal I of k[x] is just the set of
multiples of some polynomial h(x). If g(x) is irreducible, this means that the only
ideals containing I are I itself and the whole ring k[x]. So I is maximal. With some
work, this shows that every non-zero coset of the quotient ring has a multiplicative
inverse.
Corollary 1.7. If g(x) is an irreducible polynomial of degree d in Z/p[x], then the quotient ring Z/p[x]/⟨g(x)⟩ is a field of size p^d.
Example 1.8. The polynomial g(x) = x^2 + x + 1 is irreducible in R = Z/2[x]. The quotient ring is a field of size 4. The cosets are represented by {0, 1, α, α + 1} where α^2 + α + 1 = 0. Every non-zero coset has an inverse since 1 · 1 = 1 and α · (α + 1) = α^2 + α ≡ −1 ≡ 1.
Example 1.9. The polynomial g(x) = x^3 + x + 1 is irreducible in R = Z/2[x]. The quotient ring is a field of size 8. The cosets are represented by

{0, 1, β, β + 1, β^2, β^2 + 1, β^2 + β, β^2 + β + 1}

where β^3 + β + 1 ≡ 0. The non-zero cosets form a group of size 7 under multiplication; this group is cyclic (any non-zero coset is a generator).
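To see Example 1.9 concretely, here is a small Python sketch (our own illustration, not from the notes). Elements of the field of size 8 are stored as coefficient triples (c0, c1, c2) meaning c0 + c1·β + c2·β^2 over Z/2, and the powers of β are checked to run through all seven non-zero cosets.

def multiply(u, v):
    # Multiply two polynomials in beta over Z/2, then reduce using
    # beta^3 = beta + 1 and beta^4 = beta^2 + beta.
    prod = [0] * 5
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] ^= ui & vj          # coefficients mod 2
    if prod[4]:
        prod[2] ^= 1; prod[1] ^= 1
    if prod[3]:
        prod[1] ^= 1; prod[0] ^= 1
    return tuple(prod[:3])

beta = (0, 1, 0)
element = (1, 0, 0)                          # beta^0 = 1
powers = []
for _ in range(7):
    element = multiply(element, beta)
    powers.append(element)

print(powers)               # beta, beta^2, beta + 1, ..., and finally 1 = beta^7
print(len(set(powers)))     # 7: every non-zero coset is a power of beta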

The theory of finite fields There is a beautiful theory about finite fields, which
we could spend the whole course talking about. Here are some highlights.
Theorem 1.10. 1. If F is a finite field, then the size q of F equals p^a for some prime p and some positive integer a.

2. Given q = p^a, the set Fq of roots of x^q − x is a field of size q.

3. Given q = p^a, the field Fq is the unique field of size q.

4. The multiplicative group Fq* is cyclic.


Proof. 1. Main reason: F is a vector space over Z/p.

2. Main reason: the number of roots equals p^a since the polynomial is separable. The set of roots is closed under addition/subtraction and multiplication/division.

3. Given a field F of size q, then F* is a multiplicative group of size q − 1. If α ∈ F*, then α^(q−1) = 1 by Lagrange’s Theorem. Then F ≅ Fq since every element of F is a root of x^q − x.

4. By the structure theorem for finite abelian groups, q − 1 is a multiple of the exponent d of Fq*. Also, every element of Fq* is a root of x^d − 1; then q − 1 ≤ d since this polynomial has at most d roots. So d = q − 1, and a finite abelian group whose exponent equals its order is cyclic.

Definition 1.11. Let q = p^a. The Frobenius morphism π : Fq → Fq is the function given by π(α) = α^p.

Theorem 1.12. 1. The Frobenius morphism is an automorphism of Fq.

2. The Galois group of Fq over Fp is cyclic of order a and generated by π.

3. F_{p^m} ⊂ F_{p^n} if and only if m divides n.


Corollary 1.13. Let q = p^n and let R = Z/p[x]. The polynomial x^q − x factors in R into the product of all the monic irreducible polynomials of R of degree m, for every m dividing n, with each factor appearing once.

Example 1.14. In Z/2[x], the factorization is x^8 − x = x(x − 1)(x^3 + x + 1)(x^3 + x^2 + 1).
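The factorization in Example 1.14 can be checked directly; the Python sketch below (our own illustration) multiplies the claimed factors with coefficients mod 2 and compares the result with x^8 + x, which equals x^8 − x in characteristic 2.

def poly_mult_mod2(f, g):
    # Polynomials over Z/2 are lists of coefficients, lowest degree first.
    result = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            result[i + j] = (result[i + j] + fi * gj) % 2
    return result

x        = [0, 1]        # x
x_plus_1 = [1, 1]        # x + 1, which equals x - 1 mod 2
cubic1   = [1, 1, 0, 1]  # x^3 + x + 1
cubic2   = [1, 0, 1, 1]  # x^3 + x^2 + 1

product = x
for factor in (x_plus_1, cubic1, cubic2):
    product = poly_mult_mod2(product, factor)

print(product == [0, 1, 0, 0, 0, 0, 0, 0, 1])   # True: the product is x^8 + x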

Problem Session 1
1. Find:

(a) a polynomial of degree 2 which is irreducible over Q but factors over Z/5.
(b) a polynomial of degree 4 that factors over Z/2 but has no roots.

2. (a) Find a polynomial g(x) of degree 2 in R = Z/3[x] which has no roots.
(b) Explain why I = ⟨g(x)⟩ is a maximal ideal of R.
(c) What are representatives of the cosets in the quotient ring K = R/I?
(d) Find an inverse of each coset to demonstrate explicitly that K is a field.
(e) Find a coset γ + I which generates K*. In other words, if c + I is a non-zero coset in R/I, then c + I = (γ + I)^e for some exponent e.
(f) Optional: find a ring isomorphism φ : R/I → Z[i]/⟨3⟩.

3. Find the three monic polynomials in Z/3[x] which have degree 2 and are irreducible over Z/3. Call them g1, g2, g3 and compute the product

(x^3 − x)g1 g2 g3.

4. (a) Find the roots of x^2 − 1 in Z/8. Why is the answer surprising?

(b) Find two different factorizations of x^2 − 1 in Z/8[x].

Further investigation
If you like this topic, here are some more problems for you.

1. (a) For which primes p does x^2 + 1 factor over Z/p? Find some data and make a conjecture.
(b) For which primes p does x^2 − 2 factor over Z/p? Find some data and make a conjecture.

2. (a) The polynomial x^5 − x factors as x(x − 1)(x − 2)(x − 3)(x − 4) over Z/5. Use the degree 1 terms of these polynomials to prove that 4! ≡ −1 mod 5.
(b) Prove Wilson’s Theorem: If p is prime, then (p − 1)! ≡ −1 mod p.

2 Error Detecting and Correcting Codes
Errors often occur when data is transmitted. For example, the clerk transposes two
numbers when entering the universal product code on your milk at the store, or your
CD player gets bumped when you are listening to your favorite album, or some bits
of data are flipped when a telescope beams the data of images of Saturn back to
earth. Error detecting and error correcting codes are mathematical ways to encode
information for transmission so that the receiver can at least detect and hopefully
even correct errors in transmitted data.

ISBN codes One everyday example of an error detecting code is the International Standard Book Number (ISBN). Each published book is assigned a 10-digit ISBN which is printed on the back cover of the book. For example, the ISBN for The Heart of Mathematics is 0-470-49951-1. The first nine digits are actually
information about the book, while the last digit is a check digit, chosen based on
the first nine digits. If any mistakes are made in entering the number, a computer
can easily be programmed to detect that something is wrong and display an error
message. So if a clerk who was logging inventory at a book store made a mistake
entering the ISBN for a book, she could instantly know that she had goofed, and be
prompted to reenter the number. How does this work? Let ai be the i-th digit of the ISBN. Then we calculate a′10 = a1 + 2a2 + 3a3 + ... + 9a9 (mod 11). If 0 ≤ a′10 ≤ 9, let a10 = a′10. If a′10 = 10, use a10 = X.

We can see that this will detect any single error in entering the ISBN as follows. Say that ai is replaced by bi = ai + k for some k. When the computer sees the first nine digits of this corrupted ISBN, it will calculate that a′10 should equal a1 + 2a2 + 3a3 + ... + iai + ik + ... + 9a9. This is the real value of a′10 plus ik. The only way that the two check digits could match is if ik ≡ 0 mod 11. Since Z/11 is a field, this is only possible if i ≡ 0 or k ≡ 0 mod 11. Since 1 ≤ i ≤ 9 and −9 ≤ k ≤ 9, the digits will only match if k = 0, meaning no error has occurred.
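Here is a short Python sketch of the check-digit computation just described (our own illustration; the function name and the sample digits are made up).

def isbn10_check_digit(first_nine):
    # a'_10 = a1 + 2*a2 + ... + 9*a9 (mod 11), with the value 10 written as 'X'.
    total = sum(i * digit for i, digit in enumerate(first_nine, start=1)) % 11
    return 'X' if total == 10 else str(total)

digits = [0, 8, 2, 1, 8, 2, 0, 4, 9]        # an arbitrary example
print(isbn10_check_digit(digits))           # '4'

corrupted = digits.copy()
corrupted[3] += 5                           # a single-digit entry error
print(isbn10_check_digit(corrupted))        # '2', so the mismatch reveals the error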

Repetition Codes The first idea that we might have is to send our information
twice. The receiver could check whether the two copies match. If not, there must
have been a mistake. But which one has the mistake? Not so easy to tell, when
the information is a string of 0s and 1s. Maybe we send each symbol 3 times, and
look for differences in any of the copies. If there is a discrepancy, we could assume
that the mistake most likely occurred in only one copy and go with the majority.
We could correct many errors this way. However, this isn’t very efficient. We have
to send three times as much data as we actually want to transmit. Coding theory
searches for the most efficient solutions to the problem of errors in data.
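A minimal Python sketch of the 3-fold repetition idea (our own illustration): each bit is sent three times and the receiver takes a majority vote in each block of three.

def encode_repetition(bits, copies=3):
    return [b for b in bits for _ in range(copies)]

def decode_repetition(received, copies=3):
    decoded = []
    for i in range(0, len(received), copies):
        block = received[i:i + copies]
        decoded.append(1 if sum(block) > copies // 2 else 0)   # majority vote
    return decoded

message = [1, 0, 1, 1]
sent = encode_repetition(message)            # 12 bits sent for 4 bits of data
sent[4] = 1 - sent[4]                        # one bit is flipped in transit
print(decode_repetition(sent) == message)    # True: the single error is corrected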

Definitions

Definition 2.1. A code C of length n over an alphabet A is a subset of A^n. Elements of a code are called codewords. We denote the number of codewords in C by #C.

For the codes we consider, the alphabet will generally be a finite field Fq for q a
prime power.

Example 2.2. Let A = F2 , n = 4 and let C be the set {(1, 0, 0, 1), (0, 1, 1, 0), (1, 1, 1, 1), (0, 0, 0, 0)}.

This is an example of a linear code.

Definition 2.3. A function f (x) is called linear if it possesses the following two
properties:

• For all x and y in the domain of f , f (x) + f (y) = f (x + y).

• For any constant α, f (αx) = αf (x).

Example 2.4. The simplest examples of linear functions are given by the polynomials f(x) = ax for a constant a. (Note that f(x) = ax + b with b ≠ 0 is not linear in this sense, since then f(0) ≠ 0.)

Definition 2.5. A linear code of length n over an alphabet A is a code C which possesses the following properties:

• The all zeros vector is a codeword in C.

• For all ~x = (x1 , x2 , ..., xn ) and ~y = (y1 , y2 , ..., yn ) which are codewords in C,
~x + ~y = (x1 + y1 , x2 + y2 , ..., xn + yn ) is also a codeword in C.

• For any constant α in A and codeword x in C, α~x = (αx1 , αx2 , ..., αxn ) is also
a codeword in C.

In other words, a code is linear if the codewords make up a linear subspace of the vector space A^n.

Referring back to Example 2.2, we can see that though #C = 4, we could find all the codewords of C by knowing that C is a linear code that contains (1, 0, 0, 1) and (0, 1, 1, 0). We say that (1, 0, 0, 1) and (0, 1, 1, 0) are generators of C, i.e. they generate C as a vector space. To write down C as compactly as possible, we might use the matrix

   [ 1 0 0 1 ]
   [ 0 1 1 0 ].
Definition 2.6. The dimension of a linear code C is the minimal number of codewords that are needed to generate all the codewords of C using the operations in Definition 2.5; equivalently, it is the dimension of C as a linear subspace of A^n. A minimal set of codewords that will generate C is called a basis for C. The length n and dimension k of a code are called its parameters. A k × n matrix with rows consisting of a basis of C is called a generator matrix for C.

Distances in Codes It will help us to think about codes geometrically. We can
think of the codewords as points in the space An . How do we understand the distance
between codewords?

Definition 2.7. For ~x = (x1, x2, ..., xn) and ~y = (y1, y2, ..., yn) in A^n, the Hamming distance between ~x and ~y is defined to be

d(~x, ~y) = #{i : xi ≠ yi}.

Let ~0 be the all zeros vector of length n. The Hamming weight of ~x is defined to be wt(~x) = d(~x, ~0) = #{i : xi ≠ 0}.

Definition 2.8. The minimum distance of C is defined as

dmin(C) = min{d(~x, ~y) : ~x, ~y ∈ C, ~x ≠ ~y}.

It’s straightforward to see that the minimum distance of a linear code is equal to the minimum weight of any codeword of C excluding ~0. If a code C has minimum distance d, then for any ~x in C, there is no other codeword ~y in C so that d(~x, ~y) < d. We can think of this as a ball of radius (d − 1)/2 centered at each codeword ~x such that ~x is the closest codeword to any point of A^n in this ball.

Definition 2.9. A ball Br(~x) of radius r about a codeword ~x is the set of all ~y in A^n so that d(~x, ~y) ≤ r.

Say that we are using C to encode our information and we receive a transmission ~x′, which is not a codeword. If ~x′ is in B_{(d−1)/2}(~x) for some codeword ~x, that is, if d(~x, ~x′) ≤ (d − 1)/2, we can assume that the original codeword was likely to be ~x. Thus the minimum distance of a code determines the number of errors that the code can correct. A code with minimum distance d can correct up to ⌊(d − 1)/2⌋ errors. The minimum distance also tells us about a code’s ability to detect errors, even if it can’t correct them properly. If the smallest distance between two codewords is d, then a codeword would need to be affected by at least d errors before the transmission could appear to be a different valid codeword. So a computer could be programmed to notify the recipient that errors occurred as long as there are d − 1 or fewer of them.
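The Python sketch below (ours, not from the notes) computes Hamming distances, the minimum distance of the linear code from Example 2.2, and the number of errors that code can correct and detect.

from itertools import combinations

def hamming_distance(x, y):
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

code = [(1, 0, 0, 1), (0, 1, 1, 0), (1, 1, 1, 1), (0, 0, 0, 0)]   # Example 2.2

d = min(hamming_distance(x, y) for x, y in combinations(code, 2))
print(d)              # minimum distance: 2
print((d - 1) // 2)   # errors corrected: 0
print(d - 1)          # errors detected: 1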

Bounds on codes We want error correcting codes to be efficient, meaning that we want to transmit as few extra symbols as possible. For a linear code of length n
and dimension k, we can think of each codeword as having k symbols of information
and n total symbols. We would like k to be large with respect to n. We also want
error correcting codes to be able to catch and correct a lot of errors. The longer the
codewords are, the more errors we would be likely to see. If the minimum distance
of the code is d we would like d to also be large with respect to n. As you might
guess, there is some trade-off between k and d.

Theorem 2.10 (The Singleton Bound). Let C be a linear code over Fq of length n, dimension k, and minimum distance d. Then

d ≤ n − k + 1.

Proof of the theorem. Let W be the linear subspace of Fq^n which has zeros in the d-th through n-th components. That is, let

W = {~x = (x1, x2, ..., xn) ∈ Fq^n : xd = xd+1 = ... = xn = 0}.

We can see that W is a (d − 1)-dimensional subspace of Fq^n, and wt(~x) ≤ d − 1 for any ~x in W, so no non-zero element of W is in C. Therefore W ∩ C = {~0}. Let

W + C = {~w + ~c : ~w ∈ W, ~c ∈ C}.

The dimension of W + C is (d − 1) + k, so k + d − 1 ≤ n. Therefore d ≤ n − k + 1.

Problem Session 2
1. The first nine digits of the ISBN for a book are 0-312-59034. What is the
check digit?

2. You have 9 stacks of 9 coins, all of which appear to be identical. One of these
stacks is made up of all fake coins, while the rest of them are real gold. The
real coins weigh 10 grams each, while the counterfeits weigh 9 grams each.
You have a scale that will tell you the mass of any combination of the coins,
but you want to be clever and find out which stack contains the fake coins in
the least possible number of weighings. How do you do it?

3. Consider the code C over F2 with the following generator matrix, also called C:

   C = [ 1 0 0 0 1 1 0 ]
       [ 0 1 0 0 1 0 1 ]
       [ 0 0 1 0 0 1 1 ]
       [ 0 0 0 1 1 1 1 ]

(a) Find the parameters n and k of C.


(b) What is the minimum distance d of C?
(c) You receive the transmission (1, 0, 0, 1, 1, 1, 0). Is this a codeword? If not,
what was the intended codeword?
(d) How many codewords are there in C?
(e) A code that meets the Singleton bound, that is a code for which d =
n − k + 1, is called a Maximum Distance Separable code (MDS code). Is
C an MDS code?
(f) How many errors can C correct? How many can it detect?

4. Prove that the minimum distance of a linear code C is the same as the minimum weight of any codeword in C except ~0.

5. How many codewords are there in a linear code of dimension k over Fq ?

6. Given a linear code C, we now have experience finding the length, dimension,
minimum distance, and number of codewords in C. But given arbitrary length,
dimension, minimum distance, and number of codewords, does a code C exist
with these parameters? Not always.

Definition 2.11. The quantity Aq (n, d) is the maximum value of M such that
there is a linear code of length n with M codewords and minimum distance d.

Use the Singleton bound and exercise 5 to give an upper bound on Aq (n, d)
for linear codes in terms of n, q, and d.

7. Recall that, if our alphabet is Fq, we defined the ball about a point ~x of radius r to be Br(~x) = {~y ∈ Fq^n : d(~x, ~y) ≤ r}.

Definition 2.12. The quantity Vq(n, r) is the number of points of Fq^n in the ball of radius r centered at any point of Fq^n.

Prove that

Vq(n, r) = Σ_{i=0}^{r} (n choose i) (q − 1)^i,

where

(n choose k) = n!/(k!(n − k)!)

is the binomial coefficient that counts the number of (unordered) different ways to choose k things from a set of n.

The previous two exercises give us the notation to state and understand the Gilbert-Varshamov Bound:

Aq(n, d) ≥ q^n / Vq(n, d − 1).
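A small Python sketch (our own) evaluating Vq(n, r) and the two bounds above; the sample parameters are chosen arbitrarily for illustration.

from math import comb

def V(q, n, r):
    # Number of words of F_q^n within Hamming distance r of a fixed word.
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

q, n, d = 2, 7, 3
print(V(q, n, d - 1))             # 29
print(q ** n / V(q, n, d - 1))    # Gilbert-Varshamov: A_q(n, d) is at least this (about 4.4)
print(q ** (n - d + 1))           # Singleton bound with exercise 5: at most 32 codewords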

3 Points on curves
Let k be a field. In these talks, k will be a finite field Fq, where q = p^a for a prime p; for example, k = Z/p.

Planar curves Informally, a planar curve is the set of points (x, y) satisfying a polynomial equation f(x, y) = 0, where f(x, y) ∈ k[x, y] is a polynomial in two variables.

Example 3.1. (a) Rational curve y = h(x).

(b) Hyperelliptic curve y^2 = h(x).

(c) Hermitian curve y^p + y = x^(p+1).

Points on curves

Example 3.2. Recall that F4 = {0, 1, α, α + 1} where 2 ≡ 0 and α^2 + α + 1 ≡ 0. Let’s find the F4-points of the curve y^2 + y = x^3.

y        0   1   α   α+1
y^2 + y  0   0   1   1

x     0   1   α   α+1
x^3   0   1   1   1

So there are 8 points: (0, 0), (0, 1), (1, α), (1, α + 1), (α, α), (α, α + 1), (α + 1, α), (α + 1, α + 1).
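Here is a brute-force Python sketch (ours) that redoes the count in Example 3.2, modeling F4 as pairs (a, b) standing for a + bα, with coefficients mod 2 and α^2 = α + 1.

from itertools import product

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    a, b = u
    c, d = v
    # (a + b*alpha)(c + d*alpha) = ac + (ad + bc)*alpha + bd*(alpha + 1)
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

F4 = list(product(range(2), range(2)))

points = [(x, y) for x in F4 for y in F4
          if add(mul(y, y), y) == mul(mul(x, x), x)]
print(len(points))    # 8 affine points, matching Example 3.2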

Points at infinity One drawback to working with planar curves is that they are
not projective (complete, compact). To fix this, we need to study points at infinity.
The points at infinity are in bijection with slopes of asymptotes to the curve. Here
is the method to find them.

Step 1: Homogenize the equation by capitalizing x and y and adding powers of a new variable Z so that every term has the same total degree.

Step 2: Set Z = 0 and find the conditions this places on X and Y .

Step 3: The points at infinity are solutions [X, Y, 0] to the homogenized equation,
with X, Y not both zero, up to scalar equivalence.

Example 3.3. There is one point at ∞ on the curve y^2 + y = x^3. The homogenized equation is Y^2 Z + Y Z^2 = X^3. If Z = 0, then X = 0. So the solutions are [0, Y, 0]. Up to scalar equivalence, there is one point at infinity, namely [0, 1, 0].

Why does this work? To describe it precisely, we would define the projective plane P^2. Its points are of the form [X, Y, Z] with X, Y, Z ∈ k not all 0, where two points are equivalent if one is a scalar multiple of the other. A projective curve in P^2 is the set of equivalence classes of points [X, Y, Z] satisfying a homogeneous equation F(X, Y, Z) = 0. The affine plane is contained in the projective plane: A^2 ⊂ P^2. Given a planar curve f(x, y), there is a unique projective curve F(X, Y, Z) whose restriction to A^2 is f.

Singular points

Definition 3.4. A planar curve f (x, y) is singular at a point on the curve if both
partial derivatives of f vanish at that point. A projective curve F (X, Y, Z) is singular
at a point on the curve if all three partial derivatives vanish at that point.

Example 3.5. In characteristic 2, there are no singular points on the curve y^2 + y = x^3 because fy = 1 everywhere. For the projective curve Y^2 Z + Y Z^2 = X^3, the partial derivatives are FX = −3X^2 ≡ X^2, FY = Z^2 and FZ = Y^2. These partial derivatives are simultaneously zero only at [0, 0, 0], which is not a valid point of projective space.

The genus The genus of a complex Riemann surface equals the number of its
’handles’ and is a topological invariant. It is also possible to define the genus of a
smooth projective curve defined over Fp ; it is the dimension of the vector space of
holomorphic 1-forms. Naively speaking, it measures how complicated the curve is.

Lemma 3.6. (a) The genus of a rational curve y = h(x) is 0.

(b) The genus of a hyperelliptic curve y^2 = h(x), where deg(h(x)) = d, is ⌊(d − 1)/2⌋.

(c) Plücker formula: If C is a smooth projective curve of degree d in P^2 then g = (d − 1)(d − 2)/2.

The Riemann-Hurwitz formula is a good method to find the genus of a curve.

Problem Session 3:
1. Let’s count the F9-points on the curve y^3 + y = x^4. Write Z/3 = {0, ±1} and F9 = {a + bγ | a, b ∈ Z/3} where γ^2 = −1.

(a) Fill in these tables:

x     0   ±1   ±γ   ±(1 + γ)   ±(1 − γ)
x^4

y        0   1   −1   γ   γ+1   γ−1   −γ   −γ+1   −γ−1
y^3 + y

(b) Use the tables to find all the solutions (x, y) to y^3 + y = x^4 when x, y ∈ F9.
(c) Find the points at infinity for the curve y^3 + y = x^4.
(d) How many F9-points does the curve y^3 + y = x^4 have?
(e) Does the curve y^3 + y = x^4 have any singular points?

2. Consider the curve defined by the equation y^2 = x^3 + x.

(a) Give the homogeneous form of the above equation and find the points at infinity.
(b) Show that the curve is smooth as long as the characteristic is not 2.
(c) Determine the genus of the curve.

3. Consider the curves defined by y^2 = x^3 + x^2 and y^2 = x^3.

(a) Draw the graphs of the curves in R^2.
(b) Determine the singular points of the curves.

4 Reed-Solomon codes
Reed-Solomon codes were invented in 1960 at MIT Lincoln Lab. At that time, technology was too weak to implement them. Their first uses (in the early 1980s) were for digital photos from the Voyager space probe and for compact discs.

Notation 4.1. Let q = p^n be a prime power and label the non-zero elements of Fq as α1, . . . , αq−1. Let k be such that 1 ≤ k ≤ q − 1. Let Lk−1 be the set of polynomials g(x) with deg(g) ≤ k − 1. Given f ∈ Lk−1, let f(~α) = (f(α1), . . . , f(αq−1)).

Definition 4.2. The Reed-Solomon code RS(k, q) is the subset of Fq^(q−1) consisting of all vectors f(~α) for f ∈ Lk−1.

Lemma 4.3. (i) Lk−1 is a vector space over Fq of dimension k.

(ii) RS(k, q) is a linear code over Fq of length q − 1 and dimension k.

(iii) The number of codewords is q^k.

Proof. (i) A basis for Lk−1 over Fq is {1, x, x^2, . . . , x^(k−1)}.

(ii) Suppose f1(~α) and f2(~α) are in RS(k, q). Then f1(x) and f2(x) are in Lk−1. By part (i), Lk−1 is a vector space over Fq. So if c ∈ Fq, then cf1 + f2 ∈ Lk−1. Then cf1(~α) + f2(~α) = (cf1 + f2)(~α) ∈ RS(k, q). Thus RS(k, q) is a linear code over Fq. The dimension of RS(k, q) is the same as the dimension of Lk−1. A basis for RS(k, q) is given by the vectors (α1^i, . . . , αq−1^i) for 0 ≤ i ≤ k − 1.

Example 4.4. Let q = 5 and k = 3. Let α1 = 1, . . . , α4 = 4. A basis for L2 is {1, x, x^2}. Let f1(x) = 1, let f2(x) = x, and f3(x) = x^2. Then f1(~α) = (1, 1, 1, 1), and f2(~α) = (1, 2, 3, 4), and f3(~α) = (1, 4, 4, 1). These three vectors are a basis for RS(3, 5). This is a code over Z/5 with length 4 and dimension 3.

Remark 4.5. Notice that Lk−1 is the vector space of functions on P^1 whose only poles are at ∞ and such that the order of the pole at ∞ is at most k − 1. The entries of a codeword are the values of the function at the (non-zero) points of P^1.

Transmitting data One interesting thing about the Reed-Solomon code is that all the entries are treated equally. Unlike the ISBN code or the (7, 4) code, there are no ’check digits’. The data does not appear directly as part of the transmitted message. So how does it work?

Transmitting data:

Definition 4.6. If ~c = (c0, . . . , ck−1) is the data, then let f~c(x) = Σ_{i=0}^{k−1} ci x^i, and the codeword is f~c(~α).

Example 4.7. Suppose q = 5 and k = 3. Then we can transmit three pieces of data, let’s say (c0, c1, c2). First we create a function f(x) = c0 + c1 x + c2 x^2. Then the codeword is (f(1), f(2), f(3), f(4)). Specifically, if the data is (4, 0, 1) then f(x) = x^2 + 4 and the codeword is (0, 3, 3, 0).
This can be implemented easily using linear algebra.
Definition 4.8. The generator matrix Mk,q for RS(k, q) is the matrix with q − 1 columns and k rows constructed as follows: let f1(x) = 1, f2(x) = x, . . . , fk(x) = x^(k−1). The j-th row of Mk,q is the codeword fj(~α). Given a data vector ~c = (c0, . . . , ck−1), the codeword is ~c Mk,q.
Example 4.9.

   M3,5 = [ 1 1 1 1 ]
          [ 1 2 3 4 ]
          [ 1 4 4 1 ]
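A short Python sketch (ours) of the encoding step in Example 4.7: the data vector is multiplied by M3,5 over Z/5.

q, k = 5, 3
alphas = [1, 2, 3, 4]                       # the non-zero elements of Z/5

# The j-th row of M_{k,q} is (alpha^j for each alpha), j = 0, ..., k - 1.
M = [[pow(a, j, q) for a in alphas] for j in range(k)]

def encode(data):
    # Codeword c * M over Z/5 for a data vector c of length k.
    return [sum(c * row[i] for c, row in zip(data, M)) % q
            for i in range(len(alphas))]

print(M)                   # [[1, 1, 1, 1], [1, 2, 3, 4], [1, 4, 4, 1]]
print(encode([4, 0, 1]))   # [0, 3, 3, 0], as in Example 4.7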

Interpreting data: Suppose the codeword is ~b = (b1, . . . , bq−1). We need to find a function f(x) with degree at most k − 1 such that f(~α) = ~b, i.e., such that f(αi) = bi. Then the data is given by the coefficients of f(x).

Lemma 4.10. The data vector ~c is the solution to the linear system ~cMk,q = ~b.
Specifically, we can find the data vector ~c using row reduction.
Lemma 4.11. If the linear system is inconsistent, then there has been an error in
transmission.
Example 4.12. There is no vector in RS(3, 5) with weight 1. To see this, suppose f(~α) is a vector in RS(3, 5) with length 4 and weight 1; this means exactly three entries are zero, e.g., (0, 0, 0, 2). Then f(x) is a non-zero polynomial with three roots in Z/5. This is impossible since deg(f) ≤ 2.

The minimal distance of a Reed-Solomon code Recall that the minimal dis-
tance of a code is the smallest non-zero Hamming distance between two codewords.
This measures how many entries are different between the codewords. It determines
the number of errors in transmission that can be detected and corrected. The dis-
tance of a Reed-Solomon code is optimal given the fixed length q − 1 and dimension
k.
Theorem 4.13. The Reed-Solomon code RS(k, q) has distance d = q − k.

Proof. By the Singleton bound, d ≤ n−k +1 where n is the length of the codewords.
Since n = q − 1, this implies d ≤ q − k.
To prove that d ≥ q − k, recall that the minimal distance of a linear code is the same as the minimal weight of a non-zero codeword. The weight of a codeword is the number of its entries
which are non-zero. So we need to show that if f (~α) is a non-zero codeword, then
the number of its entries which are non-zero is at least q − k. In other words, we
need to show that the number of roots of f (x) is at most (q − 1) − (q − k) = k − 1.
This is true since f (x) has degree at most k − 1.
The Reed-Solomon codes are very good codes. For fixed q and k, the distance of
the code RS(k, q) is optimal. One drawback of the Reed-Solomon codes is that, once
q is fixed, the length of the codewords is fixed at q − 1 and the dimension is bounded
by q − 1. Next time we will look for codes where the distance and dimension can be
large relative to q.

Problem Session 4:
1. Let’s investigate the Reed-Solomon code RS(2, 5).

(a) What is a basis for the codewords? How many codewords are there?
What is the matrix M2,5 ?
(b) If the data is ~c = (1, 3), what is the codeword?
(c) If the codeword is ~b = (1, 0, 4, 3), what is the data?
(d) If the codeword is ~b = (1, 0, 2, 4), show that an error occurred in trans-
mission. What is the best guess for the data vector ~c?
(e) What is the minimum distance of RS(2, 5)? How many errors can this code detect? How many errors can this code correct?

2. What happens to RS(k, q) if we let k ≥ q? Find a basis for RS(5, 3).

3. If I want a Reed-Solomon code with at least 200 codewords that will correct
2 errors, what could the parameters be? What is the smallest choice of q?

4. This exercise helps to prove the assertion in the proof of Lemma 4.3 that the dimension of RS(k, q) is k.

(a) Given α1 ≠ α2 ∈ Fq, find polynomials f1, f2 of degree 1 such that

f1 (α1 ) = 1, f2 (α1 ) = 0,

and
f1 (α2 ) = 0, f2 (α2 ) = 1.
(b) Given distinct α1, α2, α3 ∈ Fq, find a polynomial f1 of degree 2 such that

f1 (α1 ) = 1, f1 (α2 ) = 0, f1 (α3 ) = 0.

(c) Explain why RS(k, q) contains codewords beginning with (1, 0, 0, 0, ...),
(0, 1, 0, 0, ...), etc.
(d) Explain why RS(k, q) has dimension ≥ k.

5 Goppa codes and maximal curves
In 1977, Goppa invented new error-correcting codes using algebraic geometry. The
strategy is to use a curve C and a set S of points on C defined over a finite field F.
The codewords are constructed by evaluating functions on C at the points of S.
Definition 5.1. Let C be a smooth projective curve over Fq . Let D = rP∞ where
P∞ is the point at infinity of C and r is a natural number. Let S = {α1 , . . . , αn } be
a set of distinct points of C defined over Fq , not including P∞ . (More generally, D
is a divisor of C, and S is a set of points disjoint from the support of D.) Let L(D)
be the set of functions on C having poles only at P∞ such that the order of the pole
at ∞ is at most r.
For f ∈ L(D), the codeword is f(~S) = (f(α1), . . . , f(αn)). The Goppa code is

G(C, S, D) = {f(~S) | f ∈ L(D)}.

The Reed-Solomon code is the special case where C is a line (for example, y = 0) and the functions are evaluated at its non-zero affine points.


Theorem 5.2. Let g be the genus of C. If 2g − 2 < r < n, then the Goppa code
G(C, S, D) is a linear code of length n, dimension k = r+1−g and minimal distance
d for some d ≥ n − r.
Proof. The code is linear since L(D) is linear and has length n by definition. The
Riemann-Roch formula is needed to show that dim(L(D)) = k. The lower bound
on d comes from an upper bound on the number of zeros of a function whose poles
have order at most r.
Example 5.3. Recall that F4 = {0, 1, α, α + 1} where 2 ≡ 0 and α^2 + α + 1 ≡ 0. Let C be the curve Y^2 Z + Y Z^2 = X^3 and let P∞ be the point at ∞ of C. Other than P∞, there are eight F4-points on C, namely α0 = (0, 0), α1 = (0, 1), α2 = (1, α), α3 = (1, α + 1), α4 = (α, α), α5 = (α, α + 1), α6 = (α + 1, α), α7 = (α + 1, α + 1). Let S = {α1, . . . , α7}.
In P^2, the line Z = 0 intersects C three times at P∞; the line Y = 0 intersects C three times at α0; the line X = 0 intersects C at α0, α1 and P∞.
A basis for L(3P∞) is {1, X/Z, Y/Z}. The function f1 = 1 gives the codeword

(1, 1, 1, 1, 1, 1, 1).

The function f2 = X/Z gives the codeword

(0, 1, 1, α, α, α + 1, α + 1).

The function f3 = Y/Z gives the codeword

(1, α, α + 1, α, α + 1, α, α + 1).

This is a Goppa code of length 7, dimension 3, and d ≥ 4. In fact, d = 4: for example, the sum of the three codewords above is (0, α, α + 1, 1, 0, 0, 1), which has weight 4.
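To check the distance, the brute-force Python sketch below (our own, with F4 again modeled as pairs (a, b) = a + bα) runs over all F4-linear combinations of the three codewords above and records the smallest non-zero weight.

from itertools import product

def add(u, v):
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    a, b = u
    c, d = v
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)   # alpha^2 = alpha + 1

ZERO, ONE, ALPHA, ALPHA1 = (0, 0), (1, 0), (0, 1), (1, 1)
F4 = [ZERO, ONE, ALPHA, ALPHA1]

# The three basis codewords of Example 5.3 (from f1 = 1, f2 = X/Z, f3 = Y/Z).
rows = [
    [ONE] * 7,
    [ZERO, ONE, ONE, ALPHA, ALPHA, ALPHA1, ALPHA1],
    [ONE, ALPHA, ALPHA1, ALPHA, ALPHA1, ALPHA, ALPHA1],
]

weights = []
for a, b, c in product(F4, repeat=3):
    if (a, b, c) == (ZERO, ZERO, ZERO):
        continue
    word = [add(add(mul(a, r1), mul(b, r2)), mul(c, r3))
            for r1, r2, r3 in zip(*rows)]
    weights.append(sum(1 for entry in word if entry != ZERO))

print(min(weights))    # 4: the minimum distance of this Goppa code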

For fixed q, to optimize a Goppa code, we would like to have n large in comparison
with g. In other words, we need C to have a lot of points defined over Fq in
comparison with g. But there are limits.
Theorem 5.4 (Hasse-Weil bound). If C is a smooth projective curve of genus g, then the number of points of C defined over Fq is at most q + 1 + 2g√q.
Definition 5.5. A curve C is maximal over Fq if it realizes the Hasse-Weil bound
over Fq .
Here is an example of a maximal curve.
Definition 5.6. Given a prime power q, the Hermitian curve Hq is the smooth projective curve with affine equation y^q + y = x^(q+1). Its homogeneous equation is Y^q Z + Y Z^q = X^(q+1).
Theorem 5.7. (i) The genus of Hq is q(q − 1)/2.
(ii) The number of F_{q^2}-points of Hq is q^3 + 1.
(iii) The curve Hq is maximal over F_{q^2}.
Proof. (i) The partial derivatives of F = Y^q Z + Y Z^q − X^(q+1) are FX = −X^q, FY = Z^q and FZ = Y^q. These are simultaneously zero only at [0, 0, 0], which is not a valid point of projective space. So Hq is a smooth projective curve of degree d = q + 1 in P^2. By the Plücker formula, the genus of Hq is g = q(q − 1)/2. (Another way to prove this is to consider the map φ : Hq → P^1 given by φ([X : Y : Z]) = [Y : Z]. This map has degree q + 1, and the fibres containing a point with X = 0 or Z = 0 consist of a single point; this means there are q + 1 points of P^1 above which the map is 1-to-1. Then the genus can be computed using the Riemann-Hurwitz formula.)

(ii) If Z = 0 then X = 0, and so [0 : 1 : 0] is the only point at infinity on Hq. Consider the trace map Tr : F_{q^2} → Fq with formula Tr(y) = y^q + y. It is a surjective q-to-1 additive map. Also consider the norm map N : F*_{q^2} → F*_q with formula N(x) = x^q · x. It is a surjective (q + 1)-to-1 multiplicative map.

Suppose x, y ∈ F_{q^2} with y^q + y = x^(q+1). Then Tr(y) = N(x) ∈ Fq. If N(x) = 0, then there is one choice for x, namely x = 0, and q choices for y. For each of the q − 1 non-zero values of N(x), there are q + 1 choices for x and q choices for y. This gives (q − 1)(q + 1)q + q = q^3 affine points, hence q^3 + 1 points defined over F_{q^2}, counting the point at infinity.

(iii) The number of points of Hq defined over F_{q^2} is at most q^2 + 1 + 2gq = q^3 + 1 by the Hasse-Weil bound. Thus the upper bound is achieved.
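As a sanity check of part (ii) for q = 3, the brute-force Python sketch below (ours) models F9 as Z/3[γ]/(γ^2 + 1), as in Problem Session 3, and counts the points of y^3 + y = x^4.

from itertools import product

P = 3   # the characteristic; F9 elements are pairs (a, b) meaning a + b*gamma

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    a, b = u
    c, d = v
    # (a + b*gamma)(c + d*gamma) = (ac - bd) + (ad + bc)*gamma, since gamma^2 = -1
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, n):
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

F9 = list(product(range(P), range(P)))

affine = sum(1 for x in F9 for y in F9
             if add(power(y, 3), y) == power(x, 4))

print(affine + 1)    # 28, counting the one point at infinity
print(P ** 3 + 1)    # the count predicted by Theorem 5.7(ii)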

Tsfasman, Vladut, and Zink used modular curves to find a sequence of Goppa
codes with asymptotically better parameters than any earlier known codes.

Problem session 5:
1. (From Walker, problems 5.15 and 6.5) Let C be the projective elliptic curve with equation

Y^2 Z + Y Z^2 = X^3 + Z^3

defined over the field F2.

(a) Show that C is smooth and has genus 1.


(b) Find the set S of points of C defined over F4 .
(c) Show that the following functions have poles only at ∞ and find the order of the poles at ∞:

1, X/Z, Y/Z, X^2/Z^2, XY/Z^2.

(d) Find a basis of the Goppa code G(C, S, 5P∞ ).


(e) What are the invariants of this code?
