CC 3 LinearBlockCodes PDF
Outlines
- Introduction to linear block codes
- Syndrome and error detection
- The minimum distance of a block code
- Error-detecting and error-correcting capabilities of a block code
- Standard array and syndrome decoding
- Probability of an undetected error for linear codes over a binary symmetric channel (BSC)
- Single-parity-check codes, repetition codes, and self-dual codes
The encoder transforms each k-bit input message u into a binary n-tuple v with n > k. This n-tuple v is referred to as the code word (or code vector) of the message u. There are 2^k distinct code words.
Code word v: an n-tuple with n > k; there are 2^k distinct code words.
v = u_0 g_0 + u_1 g_1 + ... + u_{k-1} g_{k-1}   (3.3)

where u_i = 0 or 1 for 0 <= i < k
If u = (1 1 0 1) is the message to be encoded, its corresponding code word, according to (3.3), would be

v = 1 · g_0 + 1 · g_1 + 0 · g_2 + 1 · g_3
  = (1 1 0 1 0 0 0) + (0 1 1 0 1 0 0) + (1 0 1 0 0 0 1)
  = (0 0 0 1 1 0 1)
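The modulo-2 sum above can be checked numerically. A minimal sketch (the use of numpy and these variable names are assumptions, not part of the text), with G taken as the generator matrix of the (7, 4) code in Table 3.1:

```python
import numpy as np

# Generator matrix of the (7, 4) code; rows are the basis codewords g0..g3.
G = np.array([
    [1, 1, 0, 1, 0, 0, 0],  # g0
    [0, 1, 1, 0, 1, 0, 0],  # g1
    [1, 1, 1, 0, 0, 1, 0],  # g2
    [1, 0, 1, 0, 0, 0, 1],  # g3
], dtype=int)

u = np.array([1, 1, 0, 1])   # message
v = u @ G % 2                # modulo-2 linear combination of the rows
print(v)                     # [0 0 0 1 1 0 1]
```

The product u @ G % 2 is exactly the sum u_0 g_0 + u_1 g_1 + u_2 g_2 + u_3 g_3 of (3.3), reduced modulo 2.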
A linear block code with this structure is referred to as a linear systematic block code
Fig. 3.1 Systematic format of a code word: redundant checking part (n - k digits) followed by the message part (k digits).
G = [ g_0; g_1; ...; g_{k-1} ] =

[ p_00      p_01      ...  p_{0,n-k-1}      | 1 0 0 ... 0 ]
[ p_10      p_11      ...  p_{1,n-k-1}      | 0 1 0 ... 0 ]
[  .         .              .               | . . .     . ]
[ p_{k-1,0} p_{k-1,1} ...  p_{k-1,n-k-1}    | 0 0 0 ... 1 ]   (3.4)

where p_ij = 0 or 1
It follows from (3.4) and (3.5) that the components of v are

v_{n-k+i} = u_i for 0 <= i < k   (3.6a)

and

v_j = u_0 p_{0j} + u_1 p_{1j} + ... + u_{k-1} p_{k-1,j} for 0 <= j < n - k   (3.6b)
v = u · G = (u_0, u_1, u_2, u_3) ·

[ 1 1 0 | 1 0 0 0 ]
[ 0 1 1 | 0 1 0 0 ]
[ 1 1 1 | 0 0 1 0 ]
[ 1 0 1 | 0 0 0 1 ]
The parity-check matrix corresponding to the generator matrix G = [P I_k] is

H = [I_{n-k} P^T] =

[ 1 0 ... 0 | p_00        p_10        ...  p_{k-1,0}       ]
[ 0 1 ... 0 | p_01        p_11        ...  p_{k-1,1}       ]
[ . .     . |  .           .                .              ]
[ 0 0 ... 1 | p_{0,n-k-1} p_{1,n-k-1} ...  p_{k-1,n-k-1}   ]   (3.7)
g_i · h_j = p_ij + p_ij = 0 for 0 <= i < k and 0 <= j < n - k

This implies that G · H^T = 0
H = [ 1 0 0 1 0 1 1 ]
    [ 0 1 0 1 1 1 0 ]
    [ 0 0 1 0 1 1 1 ]
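The orthogonality G · H^T = 0 for this (7, 4) code can be verified directly. A small sketch, assuming numpy, building G = [P I_4] and H = [I_3 P^T] from the same submatrix P:

```python
import numpy as np

# Submatrix P of the (7, 4) code; G = [P | I4], H = [I3 | P^T].
P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
G = np.hstack([P, np.eye(4, dtype=int)])
H = np.hstack([np.eye(3, dtype=int), P.T])

# Every row gi of G is orthogonal to every row hj of H (mod 2).
print((G @ H.T % 2 == 0).all())   # True
```

Algebraically, G H^T = [P | I] [I ; P]^T-stacked = P + P = 0 modulo 2, which is exactly the identity g_i · h_j = p_ij + p_ij = 0.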
The complexity of the encoding circuit is linearly proportional to the block length. The encoding circuit for the (7, 4) code given in Table 3.1 is shown in Fig. 3.3
e_i = 1 for r_i != v_i and e_i = 0 for r_i = v_i. The n-tuple e is called the error vector (or error pattern)
When r is received, the decoder computes the following (n - k)-tuple:

s = r · H^T = (s_0, s_1, ..., s_{n-k-1})   (3.10)

which is called the syndrome of r
The Minimum Distance of a Block Code
Let v = (v_0, v_1, ..., v_{n-1}) be a binary n-tuple. The Hamming weight (or simply weight) of v, denoted by w(v), is defined as the number of nonzero components of v. For example, the Hamming weight of v = (1 0 0 0 1 1 0) is 3.
Let v and w be two n-tuples. The Hamming distance between v and w, denoted d(v, w), is defined as the number of places where they differ. For example, the Hamming distance between v = (1 0 0 1 0 1 1) and w = (0 1 0 0 0 1 1) is 3.
The Hamming distance is a metric that satisfies the triangle inequality

d(v, w) + d(w, x) >= d(v, x)   (3.14)

The proof of this inequality is left as a problem. It follows from the definition of Hamming distance and the definition of modulo-2 addition that the Hamming distance between two n-tuples v and w is equal to the Hamming weight of their sum, that is,

d(v, w) = w(v + w)   (3.15)
For example, the Hamming distance between v = (1 0 0 1 0 1 1) and w = (1 1 1 0 0 1 0) is 4 and the weight of v + w = (0 1 1 1 0 0 1) is also 4
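A quick sketch of this identity in plain Python (the helper names are illustrative only):

```python
# Hamming weight, Hamming distance, and the identity d(v, w) = w(v + w)
# over GF(2), checked on the example vectors from the text.
def weight(x):
    return sum(x)                              # number of nonzero components

def distance(v, w):
    return sum(a != b for a, b in zip(v, w))   # number of differing places

v = [1, 0, 0, 1, 0, 1, 1]
w = [1, 1, 1, 0, 0, 1, 0]
s = [(a + b) % 2 for a, b in zip(v, w)]        # modulo-2 sum v + w
print(distance(v, w), weight(s))               # 4 4
```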
Let C be a linear block code with parity-check matrix H.
Corollary 3.2.1: If no d - 1 or fewer columns of H add to 0, the code has minimum weight at least d.
Corollary 3.2.2: The minimum weight of C is equal to the smallest number of columns of H that sum to 0.
Error-Detecting and Error-Correcting Capabilities of a Block Code
If the minimum distance of a block code C is d_min, any two distinct code vectors of C differ in at least d_min places. A block code with minimum distance d_min is therefore capable of detecting all the error patterns of d_min - 1 or fewer errors. However, it cannot detect all the error patterns of d_min errors, because there exists at least one pair of code vectors that differ in d_min places, and there is an error pattern of d_min errors that carries one into the other. The random-error-detecting capability of a block code with minimum distance d_min is thus d_min - 1.
An (n, k) linear code is capable of detecting 2^n - 2^k error patterns of length n. Among the 2^n - 1 possible nonzero error patterns, there are 2^k - 1 error patterns that are identical to the 2^k - 1 nonzero code words. If any of these 2^k - 1 error patterns occurs, it alters the transmitted code word v into another code word w; thus w will be received and its syndrome is zero. There are therefore 2^k - 1 undetectable error patterns. If an error pattern is not identical to a nonzero code word, the received vector r will not be a code word and the syndrome will not be zero.
These 2^n - 2^k error patterns are the detectable error patterns. Let A_i be the number of code vectors of weight i in C; the numbers A_0, A_1, ..., A_n are called the weight distribution of C. Let P_u(E) denote the probability of an undetected error. Since an undetected error occurs only when the error pattern is identical to a nonzero code vector of C,

P_u(E) = Σ_{i=1}^{n} A_i p^i (1 - p)^{n-i}   (3.19)

where p is the transition probability of the BSC. If the minimum distance of C is d_min, then A_1 through A_{d_min - 1} are zero.
A block code with minimum distance d_min guarantees correcting all the error patterns of t = ⌊(d_min - 1)/2⌋ or fewer errors, where ⌊(d_min - 1)/2⌋ denotes the largest integer no greater than (d_min - 1)/2. The parameter t = ⌊(d_min - 1)/2⌋ is called the random-error-correcting capability of the code, and the code is referred to as a t-error-correcting code. A block code with random-error-correcting capability t is usually also capable of correcting many error patterns of t + 1 or more errors. In fact, a t-error-correcting (n, k) linear code is capable of correcting a total of 2^{n-k} error patterns (shown in the next section).
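Since the minimum distance of a linear code equals its minimum nonzero codeword weight, d_min and t can be found for a small code by brute-force enumeration. A sketch, assuming the (7, 4) generator matrix of Table 3.1:

```python
from itertools import product

# Brute-force d_min of a small linear code: the minimum weight over all
# nonzero codewords (valid because the code is linear).
G = [[1, 1, 0, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 0, 0],
     [1, 1, 1, 0, 0, 1, 0],
     [1, 0, 1, 0, 0, 0, 1]]

def encode(u):
    # v_j = sum_i u_i * g_ij mod 2, one component per column of G
    return [sum(ui * gij for ui, gij in zip(u, col)) % 2
            for col in zip(*G)]

weights = [sum(encode(u)) for u in product([0, 1], repeat=4) if any(u)]
d_min = min(weights)
t = (d_min - 1) // 2        # floor((d_min - 1) / 2)
print(d_min, t)             # 3 1
```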
Standard Array and Syndrome Decoding
Let v_1, v_2, ..., v_{2^k} be the code vectors of C. Any decoding scheme used at the receiver is a rule to partition the 2^n possible received vectors into 2^k disjoint subsets D_1, D_2, ..., D_{2^k} such that the code vector v_i is contained in the subset D_i for 1 <= i <= 2^k. Each subset D_i is in one-to-one correspondence with a code vector v_i. If the received vector r is found in the subset D_i, r is decoded into v_i. Correct decoding is made if and only if the received vector r is in the subset D_i that corresponds to the actual code vector transmitted.
A method to partition the 2^n possible received vectors into 2^k disjoint subsets such that each subset contains one and only one code vector is described here.
First, the 2^k code vectors of C are placed in a row with the all-zero code vector v_1 = (0, 0, ..., 0) as the first (leftmost) element. From the remaining 2^n - 2^k n-tuples, an n-tuple e_2 is chosen and placed under the zero vector v_1. A second row is formed by adding e_2 to each code vector v_i in the first row and placing the sum e_2 + v_i under v_i. An unused n-tuple e_3 is then chosen from the remaining n-tuples and placed under e_2, and a third row is formed by adding e_3 to each code vector v_i in the first row and placing e_3 + v_i under v_i. We continue this process until all the n-tuples are used.
Then we have an array of rows and columns as shown in Fig. 3.6. This array is called a standard array of the given linear code C.
Theorem 3.3: No two n-tuples in the same row of a standard array are identical. Every n-tuple appears in one and only one row.
Proof: The first part of the theorem follows from the fact that all the code vectors of C are distinct. Suppose that two n-tuples in the l-th row are identical, say e_l + v_i = e_l + v_j with i != j. This means that v_i = v_j, which is impossible; therefore no two n-tuples in the same row are identical.
From Theorem 3.3 we see that there are 2^n / 2^k = 2^{n-k} disjoint rows in the standard array, and each row consists of 2^k distinct elements. The 2^{n-k} rows are called the cosets of the code C. The first n-tuple e_j of each coset is called a coset leader. Any element in a coset can be used as its coset leader.
Example 3.6 Consider the (6, 3) linear code generated by the following matrix:
G = [ 0 1 1 1 0 0 ]
    [ 1 0 1 0 1 0 ]
    [ 1 1 0 0 0 1 ]
The standard array of this code is shown in Fig. 3.7
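The construction described above can be sketched in Python for this (6, 3) code, choosing each coset leader as a minimum-weight unused 6-tuple (the variable names are illustrative):

```python
from itertools import product

# Build a standard array for the (6, 3) code of Example 3.6.
G = [[0, 1, 1, 1, 0, 0],
     [1, 0, 1, 0, 1, 0],
     [1, 1, 0, 0, 0, 1]]

def encode(u):
    return tuple(sum(a * b for a, b in zip(u, col)) % 2 for col in zip(*G))

codewords = [encode(u) for u in product([0, 1], repeat=3)]   # first row

# Scan all 6-tuples in order of increasing weight; each unused tuple
# becomes the next coset leader and its whole coset forms a new row.
candidates = sorted(product([0, 1], repeat=6), key=sum)
used, array = set(), []
for cand in candidates:
    if cand in used:
        continue
    row = [tuple((c + v) % 2 for c, v in zip(cand, cw)) for cw in codewords]
    array.append(row)
    used.update(row)

print(len(array), len(array[0]))   # 8 8: 2^(n-k) cosets of 2^k words each
```

Because the all-zero tuple has weight 0, the first row is the code itself, and every later row is a coset with a minimum-weight leader, exactly as Theorem 3.3 requires.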
A standard array of an (n, k) linear code C consists of 2^k disjoint columns. Let D_j denote the j-th column of the standard array; then

D_j = {v_j, e_2 + v_j, e_3 + v_j, ..., e_{2^{n-k}} + v_j}   (3.27)

where v_j is a code vector of C and e_2, e_3, ..., e_{2^{n-k}} are the coset leaders.
The 2^k disjoint columns D_1, D_2, ..., D_{2^k} can be used for decoding the code C. Suppose that the code vector v_j is transmitted over a noisy channel. From (3.27) we see that the received vector r is in D_j if the error pattern caused by the channel is a coset leader. If the error pattern caused by the channel is not a coset leader, an erroneous decoding will result.
The decoding is correct if and only if the error pattern caused by the channel is a coset leader. The 2^{n-k} coset leaders (including the zero vector 0) are called the correctable error patterns.
Theorem 3.4: Every (n, k) linear block code is capable of correcting 2^{n-k} error patterns.
To minimize the probability of a decoding error, the error patterns most likely to occur for a given channel should be chosen as the coset leaders. When a standard array is formed, each coset leader should be chosen to be a vector of least weight from the remaining available vectors.
With this choice, each coset leader has minimum weight in its coset, and decoding based on the standard array is minimum distance decoding (i.e., maximum likelihood decoding). Let α_i denote the number of coset leaders of weight i; the numbers α_0, α_1, ..., α_n are called the weight distribution of the coset leaders. Since a decoding error occurs if and only if the error pattern is not a coset leader, the error probability for a BSC with transition probability p is

P(E) = 1 - Σ_{i=0}^{n} α_i p^i (1 - p)^{n-i}
An (n, k) linear code is capable of detecting 2^n - 2^k error patterns, but it is capable of correcting only 2^{n-k} error patterns. The probability of a decoding error is therefore much higher than the probability of an undetected error.
Theorem 3.5: (1) For an (n, k) linear code C with minimum distance d_min, all the n-tuples of weight t = ⌊(d_min - 1)/2⌋ or less can be used as coset leaders of a standard array of C. (2) If all the n-tuples of weight t or less are used as coset leaders, there is at least one n-tuple of weight t + 1 that cannot be used as a coset leader.
It follows from the definition that x and y must be in the same coset and that w(x) + w(y) = w(v) = d_min. Suppose we choose y such that w(y) = t + 1. Since 2t + 1 <= d_min <= 2t + 2, we have w(x) = t or t + 1. If x is used as a coset leader, then y cannot be a coset leader.
Theorem 3.5 reconfirms that an (n, k) linear code with minimum distance d_min is capable of correcting all the error patterns of ⌊(d_min - 1)/2⌋ or fewer errors, but it is not capable of correcting all the error patterns of weight t + 1.
Theorem 3.6: All the 2^k n-tuples of a coset have the same syndrome. The syndromes for different cosets are different.
Proof: Consider the coset whose coset leader is e_l. A vector in this coset is the sum of e_l and some code vector v_i in C. The syndrome of this vector is (e_l + v_i)H^T = e_l H^T + v_i H^T = e_l H^T.
The syndrome of an n-tuple is an (n - k)-tuple, and there are 2^{n-k} distinct (n - k)-tuples. From Theorem 3.6, there is a one-to-one correspondence between a coset and an (n - k)-tuple syndrome. Using this correspondence, we can form a decoding table, which is much simpler to use than a standard array. The table consists of the 2^{n-k} coset leaders (the correctable error patterns) and their corresponding syndromes. This table is either stored or wired in the receiver.
The decoding of a received vector r consists of three steps:
Step 1. Compute the syndrome of r, s = r · H^T.
Step 2. Locate the coset leader e_l whose syndrome is equal to r · H^T; e_l is taken to be the error pattern caused by the channel.
Step 3. Decode the received vector r into the code vector v, i.e., v = r + e_l.
The decoding scheme described above is called the syndrome decoding or table-lookup decoding
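The three steps map directly onto a small lookup-table decoder. A sketch for the (7, 4) code, assuming numpy; the table holds the syndromes of the eight correctable error patterns (the zero pattern and the seven single-error patterns):

```python
import numpy as np

# Table-lookup syndrome decoding for the (7, 4) code of Table 3.1.
P = np.array([[1, 1, 0], [0, 1, 1], [1, 1, 1], [1, 0, 1]])
H = np.hstack([np.eye(3, dtype=int), P.T])

# Precompute syndrome -> coset leader for the correctable error patterns.
table = {}
for e in np.vstack([np.zeros((1, 7), int), np.eye(7, dtype=int)]):
    table[tuple(e @ H.T % 2)] = e

def decode(r):
    s = tuple(r @ H.T % 2)   # step 1: syndrome
    e = table[s]             # step 2: coset leader with that syndrome
    return (r + e) % 2       # step 3: v = r + e

r = np.array([1, 0, 0, 1, 1, 1, 1])  # v = (1 0 0 1 0 1 1) with one error
print(decode(r))                      # [1 0 0 1 0 1 1]
```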
Example 3.8 Consider the (7, 4) linear code given in Table 3.1, whose parity-check matrix is given in Example 3.3.
The code has 2^3 = 8 cosets, so there are eight correctable error patterns (including the all-zero vector). Since the minimum distance of the code is 3, it is capable of correcting all the error patterns of weight 1 or 0, and all the 7-tuples of weight 1 or 0 can be used as coset leaders. Here the number of correctable error patterns guaranteed by the minimum distance is equal to the total number of correctable error patterns.
Suppose that the code vector v = (1 0 0 1 0 1 1) is transmitted and r = (1 0 0 1 1 1 1) is received. For decoding r, we compute the syndrome of r:

s = (1 0 0 1 1 1 1) ·

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
[ 1 1 0 ]
[ 0 1 1 ]
[ 1 1 1 ]
[ 1 0 1 ]

= (0 1 1)
From Table 3.2 we find that (0 1 1) is the syndrome of the coset leader e = (0 0 0 0 1 0 0); then r is decoded into

v* = r + e = (1 0 0 1 1 1 1) + (0 0 0 0 1 0 0) = (1 0 0 1 0 1 1)

which is the actual code vector transmitted. The decoding is correct because the error pattern caused by the channel is a coset leader.
Now suppose that v = (0 0 0 0 0 0 0) is transmitted and r = (1 0 0 0 1 0 0) is received; two errors have occurred during the transmission of v. This error pattern is not correctable and will cause a decoding error. When r is received, the receiver computes the syndrome s = r · H^T = (1 1 1). From the decoding table we find that the coset leader e = (0 0 0 0 0 1 0) corresponds to the syndrome s = (1 1 1).
Thus r is decoded into the code vector

v* = r + e = (1 0 0 0 1 0 0) + (0 0 0 0 0 1 0) = (1 0 0 0 1 1 0)

Since v* is not the actual code vector transmitted, a decoding error is committed. Using Table 3.2, the code is capable of correcting any single error over a block of seven digits; when two or more errors occur, a decoding error will be committed.
The table-lookup decoding of an (n, k) linear code may be implemented as follows. The decoding table is regarded as the truth table of n switching functions:

e_0 = f_0(s_0, s_1, ..., s_{n-k-1})
e_1 = f_1(s_0, s_1, ..., s_{n-k-1})
...
e_{n-1} = f_{n-1}(s_0, s_1, ..., s_{n-k-1})

where s_0, s_1, ..., s_{n-k-1} are the syndrome digits and e_0, e_1, ..., e_{n-1} are the estimated error digits.
The general decoder for an (n, k) linear code based on the table-lookup scheme is shown in Fig. 3.8
Example 3.9 Consider the (7, 4) code given in Table 3.1.
The syndrome circuit for this code is shown in Fig. 3.5, and the decoding table is given by Table 3.2. From this table we form the truth table (Table 3.3). The switching expressions for the seven error digits are
e_0 = s_0 s_1' s_2'    e_1 = s_0' s_1 s_2'    e_2 = s_0' s_1' s_2
e_3 = s_0 s_1 s_2'     e_4 = s_0' s_1 s_2     e_5 = s_0 s_1 s_2
e_6 = s_0 s_1' s_2
The complete circuit of the decoder is shown in Fig. 3.9
Probability of An Undetected Error for Linear Codes Over a BSC
Let {A_0, A_1, ..., A_n} be the weight distribution of an (n, k) linear code C, and let {B_0, B_1, ..., B_n} be the weight distribution of its dual code C_d. We represent these two weight distributions in polynomial form:

A(z) = A_0 + A_1 z + ... + A_n z^n,  B(z) = B_0 + B_1 z + ... + B_n z^n   (3.31)

Then A(z) and B(z) are related by the identity

A(z) = 2^{-(n-k)} (1 + z)^n B((1 - z)/(1 + z))   (3.32)

This identity is known as the MacWilliams identity.
The polynomials A(z) and B(z) are called the weight enumerators of the (n, k) linear code C and its dual C_d. Using the MacWilliams identity, we can compute the probability of an undetected error for an (n, k) linear code from the weight distribution of its dual. From equation (3.19):
P_u(E) = Σ_{i=1}^{n} A_i p^i (1 - p)^{n-i} = (1 - p)^n Σ_{i=1}^{n} A_i (p/(1 - p))^i   (3.33)
Substituting z = p/(1 - p) into A(z) of (3.31) and using the fact that A_0 = 1, we obtain

A(p/(1 - p)) - 1 = Σ_{i=1}^{n} A_i (p/(1 - p))^i   (3.34)
Combining (3.33) and (3.34), we have the following expression for the probability of an undetected error:

P_u(E) = (1 - p)^n [A(p/(1 - p)) - 1]   (3.35)
From (3.35) and the MacWilliams identity (3.32), we finally obtain the following expression for P_u(E):

P_u(E) = 2^{-(n-k)} B(1 - 2p) - (1 - p)^n   (3.36)

where B(1 - 2p) = Σ_{i=0}^{n} B_i (1 - 2p)^i
Hence there are two ways of computing the probability of an undetected error for a linear code, and often one is easier than the other. If n - k is smaller than k, it is much easier to compute P_u(E) from (3.36); otherwise, it is easier to use (3.35).
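For the (7, 4) code of Table 3.1, whose weight enumerator is A(z) = 1 + 7z^3 + 7z^4 + z^7 and whose dual has B(z) = 1 + 7z^4, the two routes (3.35) and (3.36) can be checked against each other numerically (a sketch; the dictionary encoding of the enumerators is an illustrative choice):

```python
# Weight enumerators as {weight: count} dictionaries.
A = {0: 1, 3: 7, 4: 7, 7: 1}   # A(z) for the (7, 4) code
B = {0: 1, 4: 7}               # B(z) for its dual
n, k = 7, 4
p = 0.01                       # BSC transition probability

# (3.35): Pu(E) = (1 - p)^n [A(p / (1 - p)) - 1]
pu_335 = (1 - p)**n * sum(Ai * (p / (1 - p))**i for i, Ai in A.items() if i > 0)

# (3.36): Pu(E) = 2^-(n-k) B(1 - 2p) - (1 - p)^n
pu_336 = 2**-(n - k) * sum(Bi * (1 - 2*p)**i for i, Bi in B.items()) - (1 - p)**n

print(abs(pu_335 - pu_336) < 1e-12)   # True: both routes agree
```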
Example 3.10 Consider the (7, 4) linear code given in Table 3.1.
The dual of this code is generated by its parity-check matrix

H = [ 1 0 0 1 0 1 1 ]
    [ 0 1 0 1 1 1 0 ]
    [ 0 0 1 0 1 1 1 ]

Taking the linear combinations of the rows of H, we obtain the following eight vectors in the dual code:
(0 0 0 0 0 0 0), (1 1 0 0 1 0 1), (1 0 0 1 0 1 1), (1 0 1 1 1 0 0),
(0 1 0 1 1 1 0), (0 1 1 1 0 0 1), (0 0 1 0 1 1 1), (1 1 1 0 0 1 0)
Thus, the weight enumerator for the dual code is B(z) = 1 + 7z^4. Using (3.36), we obtain the probability of an undetected error for the (7, 4) linear code given in Table 3.1:

P_u(E) = 2^{-3} [1 + 7(1 - 2p)^4] - (1 - p)^7

This probability was also computed in Section 3.4 using the weight distribution of the code itself.
For large n, k, and n - k, the computation becomes practically impossible. Except for some short linear codes and a few small classes of linear codes, the weight distributions of many known linear codes are still unknown. Consequently, it is very difficult to compute their probability of an undetected error.
Averaged over the ensemble of (n, k) linear codes,

P_u(E) = 2^{-(n-k)} Σ_{i=1}^{n} C(n, i) p^i (1 - p)^{n-i} = 2^{-(n-k)} [1 - (1 - p)^n]   (3.42)
Since [1 - (1 - p)^n] <= 1, it is clear that P_u(E) <= 2^{-(n-k)}. Hence there exist (n, k) linear codes with probability of an undetected error, P_u(E), upper bounded by 2^{-(n-k)}. Only a few small classes of linear codes have been proved to have P_u(E) satisfying the upper bound 2^{-(n-k)}.
Hamming Codes
These codes and their variations have been widely used for error control in digital communication and data storage systems. For any positive integer m >= 3, there exists a Hamming code with the following parameters:
Code length: n = 2^m - 1
Number of information symbols: k = 2^m - m - 1
Number of parity-check symbols: n - k = m
Error-correcting capability: t = 1 (d_min = 3)
The parity-check matrix H of this code consists of all 2^m - 1 nonzero m-tuples as its columns.
In systematic form, the columns of H are arranged as follows:

H = [I_m Q]

where I_m is an m × m identity matrix and the submatrix Q consists of the 2^m - m - 1 columns that are the m-tuples of weight 2 or more.
The columns of Q may be arranged in any order without affecting the distance property and weight distribution of the code
In systematic form, the generator matrix of the code is

G = [Q^T I_{2^m - m - 1}]

where Q^T is the transpose of Q and I_{2^m - m - 1} is a (2^m - m - 1) × (2^m - m - 1) identity matrix.
Since the columns of H are nonzero and distinct, no two columns add to zero. Since H consists of all the nonzero m-tuples as its columns, the vector sum of any two columns, say h_i and h_j, must also be a column in H, say h_l, so that h_i + h_j + h_l = 0. By Corollary 3.2.2, the minimum distance of a Hamming code is therefore exactly 3.
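The two column conditions used in this argument can be checked directly for m = 3 (a sketch using only the standard library):

```python
from itertools import product, combinations

# Parity-check matrix columns of the m = 3 Hamming code: all nonzero
# m-tuples. d_min = 3 follows from the two column conditions below.
m = 3
cols = [c for c in product([0, 1], repeat=m) if any(c)]
assert len(cols) == 2**m - 1   # n = 7 columns

# No two distinct columns sum to zero (they are distinct and nonzero)...
two = any(all((a + b) % 2 == 0 for a, b in zip(x, y))
          for x, y in combinations(cols, 2))
# ...but some three columns do, so the minimum distance is exactly 3.
three = any(all((a + b + c) % 2 == 0 for a, b, c in zip(x, y, z))
            for x, y, z in combinations(cols, 3))
print(two, three)   # False True
```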
The code is capable of correcting all the error patterns with a single error, or of detecting all the error patterns of two or fewer errors. If we form the standard array for the Hamming code of length 2^m - 1: all the (2^m - 1)-tuples of weight 1 can be used as coset leaders; the number of (2^m - 1)-tuples of weight 1 is 2^m - 1; and since n - k = m, the code has 2^m cosets. Thus the zero vector 0 and the (2^m - 1)-tuples of weight 1 form all the coset leaders of the standard array.
A t-error-correcting code is called a perfect code if its standard array has all the error patterns of t or fewer errors, and no others, as coset leaders. Besides the Hamming codes, the only other nontrivial binary perfect code is the (23, 12) Golay code (Section 5.3). Decoding of Hamming codes can be accomplished easily with the table-lookup scheme.
We may delete any l columns from the parity-check matrix H of a Hamming code. This deletion results in an m × (2^m - l - 1) matrix H'. Using H' as a parity-check matrix, we obtain a shortened Hamming code with the following parameters:
Code length: n = 2^m - l - 1
Number of information symbols: k = 2^m - m - l - 1
Number of parity-check symbols: n - k = m
Minimum distance: d_min >= 3
If we delete columns from H properly, we may obtain a shortened Hamming code with minimum distance 4.
For example, if we delete from the submatrix Q all the columns of even weight, we obtain an m × 2^{m-1} matrix H' = [I_m Q'], where Q' consists of the 2^{m-1} - m columns of odd weight. Since all columns of H' have odd weight, no three columns add to zero. However, for a column h_i of weight 3 in Q', there exist three columns h_j, h_l, and h_s in I_m such that h_i + h_j + h_l + h_s = 0. Thus, the shortened Hamming code with H' as a parity-check matrix has minimum distance exactly 4. This distance-4 shortened Hamming code can be used for correcting all error patterns of a single error and simultaneously detecting all error patterns of double errors.
When a single error occurs during the transmission of a code vector, the resulting syndrome is nonzero and contains an odd number of 1s (e · H^T corresponds to a single column of H'). When double errors occur, the syndrome is nonzero but contains an even number of 1s. Decoding can therefore be accomplished in the following manner:
If the syndrome s is zero, we assume that no error occurred.
If s is nonzero and contains an odd number of 1s, we assume that a single error occurred; the single-error pattern that corresponds to s is added to the received vector for error correction.
If s is nonzero and contains an even number of 1s, an uncorrectable error pattern has been detected.
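This decoding rule can be sketched for the smallest case, m = 3 with the even-weight columns deleted, which leaves the 3 × 4 matrix below (the parity-check matrix of the (4, 1) repetition code); numpy and the function names are assumptions:

```python
import numpy as np

# Distance-4 shortened Hamming code, m = 3: keep only the odd-weight
# columns (100), (010), (001), (111).
Hprime = np.array([[1, 0, 0, 1],
                   [0, 1, 0, 1],
                   [0, 0, 1, 1]])

def decode(r):
    s = r @ Hprime.T % 2
    if not s.any():
        return ("no error", r)
    if s.sum() % 2 == 1:                 # odd-weight syndrome: single error
        for i in range(Hprime.shape[1]):
            if (Hprime[:, i] == s).all():
                e = np.zeros(Hprime.shape[1], int)
                e[i] = 1
                return ("corrected", (r + e) % 2)
    return ("detected", r)               # even-weight syndrome: double error

print(decode(np.array([1, 1, 1, 0]))[0])   # corrected (single error)
print(decode(np.array([1, 0, 0, 1]))[0])   # detected (double error)
```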
The dual code of a (2^m - 1, 2^m - m - 1) Hamming code is a (2^m - 1, m) linear code. If a Hamming code is used for error detection over a BSC, its probability of an undetected error, P_u(E), can be computed either from (3.35) and (3.43) or from (3.36) and (3.44); computing P_u(E) from (3.36) and (3.44) is easier. Combining (3.36) and (3.44), we obtain

P_u(E) = 2^{-m} {1 + (2^m - 1)(1 - 2p)^{2^{m-1}}} - (1 - p)^{2^m - 1}

The probability P_u(E) for Hamming codes does satisfy the upper bound 2^{-(n-k)} = 2^{-m} for p <= 1/2 [i.e., P_u(E) <= 2^{-m}].
The parity-check matrix of a single-parity-check (SPC) code is the all-ones row H = [1 1 ... 1]. An SPC code is also called an even-parity-check code. All the error patterns of odd weight are detectable.
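A sketch of the (n, n - 1) SPC encoder and its parity check (the function names are illustrative):

```python
# Single-parity-check (n, n-1) code: append one overall parity bit so
# every codeword has even weight; any odd-weight error pattern is detected.
def spc_encode(u):
    return u + [sum(u) % 2]          # overall parity bit

def spc_check(r):
    return sum(r) % 2 == 0           # even parity <=> syndrome is zero

v = spc_encode([1, 0, 1, 1])         # -> [1, 0, 1, 1, 1]
print(spc_check(v))                  # True: valid codeword
e = [1, 0, 0, 0, 0]                  # single (odd-weight) error pattern
r = [(a + b) % 2 for a, b in zip(v, e)]
print(spc_check(r))                  # False: error detected
```

An even-weight error pattern flips an even number of bits and leaves the parity unchanged, which is why only odd-weight patterns are detectable.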