2017 IEEE International Symposium on Information Theory (ISIT)
Weight Spectrum of Quasi-Perfect Binary Codes
with Distance 4
Valentin B. Afanassiev
Institute for Information Transmission Problems
(Kharkevich Institute), Russian Academy of Sciences
GSP-4, Moscow, 127994, Russian Federation
[email protected]
Alexander A. Davydov
Institute for Information Transmission Problems
(Kharkevich Institute), Russian Academy of Sciences
GSP-4, Moscow, 127994, Russian Federation
[email protected]
Abstract—We consider the weight spectrum of a class of quasi-perfect binary linear codes with code distance 4. For example,
the extended Hamming code and the Panchenko code are known
members of this class. It is also known that in many cases
the Panchenko code has the minimal number of weight 4 codewords.
We give exact recursive formulas for the weight spectrum of
quasi-perfect codes and their dual codes. As an example of
application of the weight spectrum, we derive a lower estimate
for the conditional probability of correction of erasure patterns
of high weight (equal to or greater than the code distance).
978-1-5090-4096-4/17/$31.00 ©2017 IEEE
I. INTRODUCTION
Calculation or estimation of the weight spectrum of a linear
code is one of the oldest unresolved problems in coding theory, and it gives rise to a
long list of other unresolved problems. Binary
quasi-perfect codes have a long history of investigation, but with
a “hole” in the area of weight distribution for most of the
codes. We found a “simple” solution
for the weight spectrum of a whole class of binary quasi-perfect
codes.
The practical motivation for this research was the
search for the most effective encoding and decoding schemes for
error correction and error detection in computer memory. The
physical volume of contemporary memory cells tends to “zero”,
while the probability of an error or defect in a cell becomes
critical for the whole memory device. As a consequence of this
trend, we need more and more effective encoding schemes for
correction of independent errors and of their collections in the
form of two-dimensional blots.
The binary quasi-perfect extended Hamming code is the traditional choice for memory devices. We suggest the
Panchenko code, in original and product forms (for blot
correction), as a better choice. Our main improvement over the traditional
solution is the extension of the decoding area due to correction
of detected errors as erasure patterns of weights equal to or
greater than the code distance.
II. QUASI-PERFECT CODES CREATED BY THE DOUBLING CONSTRUCTION
Let an $[n, n-r, d]$ code be a linear binary code of length $n$,
redundancy $r$, and minimum distance $d$. For a code with
redundancy $r$ we also introduce the following notation: $n_r$
is the length of the code, $H_r$ is its parity check matrix of size
$r \times n_r$, and $d_r$ is the code distance.
Definition 1. The doubling construction creates a parity check
matrix $H_r$ of an $[n_r, n_r-r, d_r]$ code from a parity check
matrix $H_{r-1}$ of an $[n_{r-1}, n_{r-1}-(r-1), d_{r-1}]$ code as follows:
\[ H_r = \left[\begin{array}{c|c} 0 \ldots 0 & 1 \ldots 1 \\ \hline H_{r-1} & H_{r-1} \end{array}\right]. \tag{1} \]
By (1) we have $n_r = 2n_{r-1}$.
Let us define matrices $S$ and $M$ as
\[ S = \begin{bmatrix} 1&0&0&0&1 \\ 0&1&0&0&1 \\ 0&0&1&0&1 \\ 0&0&0&1&1 \end{bmatrix}, \qquad M = \begin{bmatrix} 0&1 \\ 1&1 \end{bmatrix}. \]
Denote by $H_r^{EH}$ a parity check matrix of the extended Hamming $[2^{r-1}, 2^{r-1}-r, 4]$ code, $r \ge 3$. By (1), if $H_{r-1} = M$
(resp. $H_{r-1} = H_{r-1}^{EH}$) then $H_r = H_3^{EH}$ (resp. $H_r = H_r^{EH}$). If
in (1) we have $d_{r-1} = 3$ then $d_r = 3$, since the left part of $H_r$
contains 3 linearly dependent columns provided by the structure
of $H_{r-1}$. Finally, let $h_i$ (resp. $[0\,h_i]^T$ or $[1\,h_i]^T$) be a column
of $H_{r-1}$ (resp. $H_r$). If in (1) $d_{r-1} \ge 4$ then $d_r = 4$, as the
sum of columns $[0\,h_i]^T + [0\,h_j]^T + [1\,h_i]^T + [1\,h_j]^T$, $i \ne j$, is
equal to zero.
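As a minimal sketch of Definition 1, the doubling step and the zero-sum quadruple of columns used in the argument above can be checked in Python. The list-of-rows representation and the helper names `double` and `columns` are illustrative assumptions, not notation from the paper.

```python
def double(h_prev):
    """Doubling construction (1): h_prev is H_{r-1} given as a list of 0/1 rows."""
    n_prev = len(h_prev[0])
    top = [0] * n_prev + [1] * n_prev             # new top row 0...0 | 1...1
    return [top] + [row + row for row in h_prev]  # H_{r-1} | H_{r-1}


def columns(h):
    """Return the columns of h as tuples, top entry first."""
    return [tuple(row[i] for row in h) for i in range(len(h[0]))]


M = [[0, 1],
     [1, 1]]
H3 = double(M)    # 3 x 4 matrix obtained from M
H4 = double(H3)   # 4 x 8 matrix obtained by doubling once more

# Columns [0 h_i], [0 h_j], [1 h_i], [1 h_j] of H4 for i = 0, j = 1.
cols = columns(H4)
n_prev = len(H3[0])
quad = [cols[0], cols[1], cols[n_prev], cols[n_prev + 1]]
quad_sum = tuple(sum(bits) % 2 for bits in zip(*quad))
print(H3)
print(quad_sum)
```

Here `quad_sum` is the zero vector, so the four chosen columns of `H4` are linearly dependent, in line with $d_r = 4$.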
Definition 2. A code correcting $t$ errors is quasi-perfect if its
covering radius is equal to $t + 1$.
In particular, a quasi-perfect code with distance $d = 4$ has
covering radius 2. The minimum distance of any code correcting $t$
errors is equal to $2t+1$ or $2t+2$. A linear quasi-perfect code is
“non-extendable” in the sense that the addition of any column to
a parity check matrix decreases the code distance. Any linear
$[n, n-r, 2t+2]$ code with $2t+2 \ge 4$ is either a quasi-perfect
code or a shortening of some quasi-perfect code of redundancy
$r$ and distance $2t+2$.
Theorem 3. [1] Let $n_r \ge 2^{r-2}+2$, $r \ge 3$, and let an $[n_r, n_r-r, 4]$
code be quasi-perfect. Then a parity check matrix $H_r$ of
the code can be presented in the form (1), where the matrix $H_{r-1}$
is given in one of the following three variants only:
• $H_{r-1}$ is a parity check matrix of an $[n_{r-1}, n_{r-1}-(r-1), 4]$
quasi-perfect code with $n_{r-1} = \frac{1}{2} n_r$;
• $H_{r-1} = S$;
• $H_{r-1} = M$.
Corollary 4. [1] Let $n_r \ge 2^{r-2}+2$, $r \ge 5$, and let an
$[n_r, n_r-r, 4]$ code be quasi-perfect. Then the length $n_r$ can take
any value from the sequence
\[ n_r = 2^{r-2} + 2^{r-2-g} \quad \text{for } g = 0, 2, 3, 4, 5, \ldots, r-3. \tag{2} \]
Moreover, for each $g = 0, 2, 3, 4, 5, \ldots, r-3$, there exists an
$[n_r, n_r-r, 4]$ quasi-perfect code with $n_r = 2^{r-2} + 2^{r-2-g}$.
Also, $n_r$ cannot take any value that is not listed in (2).
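For illustration, the admissible lengths of (2) can be listed for a given redundancy. The function name `admissible_lengths` is an assumption made for this sketch only.

```python
def admissible_lengths(r):
    """Map each admissible g of (2) to the length n_r = 2^(r-2) + 2^(r-2-g)."""
    if r < 5:
        raise ValueError("Corollary 4 is stated for r >= 5")
    gs = [0] + list(range(2, r - 2))   # g = 0, 2, 3, ..., r-3
    return {g: 2 ** (r - 2) + 2 ** (r - 2 - g) for g in gs}


print(admissible_lengths(7))
```

For $r = 7$ this gives the lengths 64, 40, 36, and 34 for $g = 0, 2, 3, 4$, respectively.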
Now we give a general description of a parity check matrix
for the whole class of quasi-perfect codes with distance 4. Let
\[ B_{k,g} = [\,b_k \ldots b_k\,], \qquad g \in \{0, 2, 3, 4, 5, \ldots, r-3\}, \]
be the $(r-g-2) \times (2^g+1)$ matrix of identical columns
$b_k$, where $r \ge 5$ is the code redundancy and $b_k$ is the binary
representation of the integer $k$ (with the most significant bit
at the top position).
Corollary 5. [1] Let $n_r = 2^{r-2} + 2^{r-2-g}$, $r \ge 5$, $g \in \{0, 2, 3, 4, 5, \ldots, r-3\}$, and let an $[n_r, n_r-r, 4]$ code be
quasi-perfect. Then a parity check matrix $H_r$ of the code can
be presented in the form
\[ H_r = \left[\begin{array}{c|c|c|c} B_{0,g} & B_{1,g} & \ldots & B_{D,g} \\ \hline H_{g+2} & H_{g+2} & \ldots & H_{g+2} \end{array}\right], \tag{3} \]
where $D = 2^{r-g-2} - 1$, $H_2 = M$, $H_4 = S$, and $H_{g+2}$ is a parity
check matrix of a quasi-perfect $[2^g+1, 2^g+1-(g+2), 4]$
code if $g \ge 3$.
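The form (3) can be assembled mechanically from a base matrix $H_{g+2}$. The following is a sketch under the assumption that matrices are lists of 0/1 rows. The names `form3` and `double` are illustrative. The final comparison checks, for $H_{g+2} = S$ and $r = 6$, that (3) coincides with two applications of the doubling construction (1).

```python
def double(h_prev):
    """Doubling construction (1) on a list-of-rows matrix."""
    n_prev = len(h_prev[0])
    return [[0] * n_prev + [1] * n_prev] + [row + row for row in h_prev]


def form3(h_base, r):
    """Assemble H_r of form (3) from the base matrix H_{g+2}."""
    base_rows = len(h_base)       # g + 2
    width = len(h_base[0])        # 2^g + 1 columns per block
    top_rows = r - base_rows      # r - g - 2 rows in every block B_{k,g}
    blocks = 2 ** top_rows        # D + 1 blocks
    # Row i of the top part holds bit i of b_k (most significant bit first),
    # repeated over all columns of block k.
    top = [[(k >> (top_rows - 1 - i)) & 1
            for k in range(blocks) for _ in range(width)]
           for i in range(top_rows)]
    bottom = [row * blocks for row in h_base]
    return top + bottom


S = [[1, 0, 0, 0, 1],
     [0, 1, 0, 0, 1],
     [0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1]]
H6 = form3(S, 6)                       # g = 2, D = 3
print(len(H6), len(H6[0]))             # 6 rows, 20 columns
print(H6 == double(double(S)))         # same matrix as two doublings of S
```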
Remark 6. By Corollary 5, a parity check matrix of any quasi-perfect binary code with length $2^{r-2} + 2^{r-2-g}$ and redundancy
$r$ can be created by $(r-g-2)$-fold application of the doubling
construction.
As noted above, an arbitrary $[n, n-r, 4]$ code is
either a quasi-perfect code or a shortening of some quasi-perfect
code with $d = 4$ and redundancy $r$. Therefore Theorem 3,
Corollaries 4 and 5, and Remark 6 in fact describe all binary
linear codes with $d = 4$ and length $\ge 2^{r-2}+2$. This is why the weight
spectrum of codes obtained by the doubling construction (1)
is an important problem.
The class of codes, say $\mathcal{D}$, obtained by the doubling construction is sufficiently wide. By (1), the $[2^r-1, 2^r-1-r, 3]$
Hamming code and many of its shortenings are included in $\mathcal{D}$.
It follows from Theorem 3 that the $[2^{r-1}, 2^{r-1}-r, 4]$ extended
Hamming code and the Panchenko code $\Pi_r$ (see below) belong to
$\mathcal{D}$. Numerous other non-equivalent codes of $\mathcal{D}$ can be obtained
by multiple application of the doubling construction to distinct
quasi-perfect $[2^g+1, 2^g+1-(g+2), 4]$ codes $C_0$ with
$g \in \{0, 2, 3, 4, 5, \ldots, r-3\}$, see (3). Examples of codes $C_0$
can be found in [1]–[3] in algebraic and in geometrical forms.
For instance, we give a parity check matrix of a quasi-perfect
$[9, 9-5, 4]$ code:
\[ \left[\begin{array}{c|c} 00000 & 1111 \\ 10001 & 0000 \\ 01001 & 1001 \\ 00101 & 0101 \\ 00011 & 0011 \end{array}\right]. \]
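The stated parameters of this example can be checked by exhaustive search. The sketch below encodes each column as an integer (top row as the most significant bit). The helper names `min_distance` and `covering_radius_at_most_2` are assumptions for illustration. The minimum distance is found as the smallest number of columns with zero sum, and covering radius 2 is checked by verifying that every nonzero syndrome is a column or a sum of two columns.

```python
from itertools import combinations

# Rows of the 5 x 9 parity check matrix above (left block followed by right block).
ROWS = ["000001111", "100010000", "010011001", "001010101", "000110011"]
cols = [int("".join(row[i] for row in ROWS), 2) for i in range(9)]


def min_distance(columns, limit=5):
    """Smallest number of columns whose GF(2) sum is zero (None if above limit)."""
    for w in range(1, limit + 1):
        for subset in combinations(columns, w):
            acc = 0
            for c in subset:
                acc ^= c
            if acc == 0:
                return w
    return None


def covering_radius_at_most_2(columns, r):
    """True if every nonzero r-bit syndrome is a column or a sum of two columns."""
    reachable = set(columns) | {a ^ b for a, b in combinations(columns, 2)}
    return all(s in reachable for s in range(1, 2 ** r))


print(min_distance(cols))                   # minimum distance of the code
print(covering_radius_at_most_2(cols, 5))   # covering radius does not exceed 2
```

Since the nine columns alone do not cover all 31 nonzero syndromes, the covering radius is exactly 2.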
The quasi-perfect codes $\Pi_r$ were proposed by V. I. Panchenko in [4]. The $[n, n-r, 4]$ code $\Pi_r$ has length
$n = 5 \cdot 2^{r-4}$, redundancy $r \ge 5$, and code distance $d = 4$. (In
[5] the code $\Pi_r$ is denoted by $\Pi$.)
The parity check $r \times 5 \cdot 2^{r-4}$ matrix $P_r$ of the Panchenko code
$\Pi_r$ is the matrix $H_r$ of (3) with $g = 2$, $D = 2^{r-4} - 1$, and
$H_{g+2} = S$. So,
\[ P_r = \begin{bmatrix} B_{0,2} & B_{1,2} & B_{2,2} & \ldots & B_{D,2} \\ S & S & S & \ldots & S \end{bmatrix}. \tag{4} \]
We recall the known [4]–[6] and important properties of the
Panchenko code and its shortenings:
• For all $r$ and $n$, there exists a shortened Panchenko code
in which the number of weight 4 codewords is close to the
theoretical lower bound.
• Independently of the shortening algorithm, for all $r$ and $n$,
the number of weight 4 codewords in the Panchenko code and its
shortenings is smaller than in a shortened extended Hamming
code.
• For $r = 7$, $n \in \{32, 33, \ldots, 40\}$, and $r = 8$, $n \in \{72, 73, \ldots, 80\}$,
the Panchenko code and its shortenings by a special
algorithm have the minimal number of weight 4 codewords
among all codes of the same length and redundancy.
As a consequence of this property, the Panchenko code has a
small (often the minimal) probability of undetected error, since
this probability is essentially defined by the number of weight
4 codewords. In particular, this is important for error correction
in computer memory [5], [6].
III. WEIGHT SPECTRUM OF CODES CREATED BY THE DOUBLING CONSTRUCTION
By Section II, a parity check matrix of any quasi-perfect
binary code with $d = 4$ can be created by multiple application
of the doubling construction. Therefore, Theorems 7 and 8
allow us to obtain the weight spectrum of such a code (and of its dual)
starting from the weight spectrum of a short code.
We use the notation introduced in the previous section. Also,
for a code with redundancy $r$ we denote by $A_w^{(r)}$ the number of
codewords of weight $w$ and by $A_w^{(r)\perp}$ the number of codewords
of weight $w$ in the dual code.
Theorem 7. Let $d_r \le 4$. Assume that an $[n_r, n_r-r, d_r]$
code $C_r$ is created from a $[\frac{1}{2} n_r, \frac{1}{2} n_r - r + 1, d_{r-1}]$ code
$C_{r-1}$ by the doubling construction (1). Then the weight spectrum
$\{A_w^{(r)},\ d_r \le w \le n_r\}$ of $C_r$ can be obtained from the weight
spectrum $\{A_w^{(r-1)},\ d_{r-1} \le w \le \frac{1}{2} n_r\}$ of $C_{r-1}$ as follows:
\[ A_{2v}^{(r)} = \Delta_v^{(r)} + \sum_{j=0}^{v-2} 2^{2v-2j-1} A_{2v-2j}^{(r-1)} \binom{\frac{1}{2} n_r - 2v + 2j}{j}, \tag{5} \]
where
\[ \Delta_v^{(r)} = \begin{cases} 0 & \text{if } v \text{ is odd}, \\ \binom{\frac{1}{2} n_r}{v} & \text{if } v \text{ is even}; \end{cases} \]
\[ A_{2v+1}^{(r)} = \sum_{j=0}^{v-2} 2^{2v-2j} A_{2v+1-2j}^{(r-1)} \binom{\frac{1}{2} n_r - 2v - 1 + 2j}{j}. \tag{6} \]
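Formulas (5) and (6) translate directly into a short routine. The following Python sketch assumes that a spectrum is stored as a list `a[w]` for $w = 0, \ldots, n$ with `a[0] = 1`. The function name `doubled_spectrum` is illustrative, and the summation limits are taken exactly as printed in (5) and (6). Writing $u$ for the weight $2v-2j$ (resp. $2v+1-2j$) of the codeword of $C_{r-1}$, both summands take the common form $2^{u-1} A_u^{(r-1)} \binom{\frac{1}{2} n_r - u}{j}$.

```python
from math import comb


def doubled_spectrum(a_prev):
    """Weight spectrum of C_r from that of C_{r-1} by formulas (5) and (6).

    a_prev[w] is the number of weight-w codewords of C_{r-1}, w = 0..n_{r-1}.
    """
    half = len(a_prev) - 1            # n_{r-1} = n_r / 2
    n = 2 * half
    a = [0] * (n + 1)
    for w in range(n + 1):
        v, odd = divmod(w, 2)         # w = 2v or w = 2v + 1
        total = 0
        if not odd and v % 2 == 0:
            total += comb(half, v)    # the term Delta_v^{(r)} of (5)
        for j in range(v - 1):        # j = 0, 1, ..., v - 2
            u = w - 2 * j             # weight of the codeword of C_{r-1}
            if u > half or a_prev[u] == 0:
                continue              # no such codeword, summand vanishes
            total += 2 ** (u - 1) * a_prev[u] * comb(half - u, j)
        a[w] = total
    return a


a8 = doubled_spectrum([1, 0, 0, 0, 1])   # from the [4, 1, 4] code (H_3^{EH})
a16 = doubled_spectrum(a8)               # one more doubling
print(a8)
print(a16)
```

Starting from the $[4, 1, 4]$ code, two doublings yield the spectra of the $[8, 4, 4]$ and $[16, 11, 4]$ extended Hamming codes. The second one sums to $2^{11}$, as it must.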
Proof. We consider the structures of weight $w$ codewords and the
structures of the corresponding sets of $w$ columns of a parity
check matrix.
Let $u \in \{r, r-1\}$. Let $c_{w,u}$ be a weight $w$ codeword of
the code $C_u$. Denote by $H_u(c_{w,u})$ the set of $w$ columns of the
matrix $H_u$ corresponding to the codeword $c_{w,u}$. By definition,
the sum of all columns of $H_u(c_{w,u})$ is equal to zero.
We describe column sets of $H_r$ in (1) with the help of
column sets of $H_{r-1}$ placed in the left and right sides of (1).
(i) Let us consider all possible structures of codewords
$c_{2v,r}$ of even weight $2v$ and the corresponding column sets
$H_r(c_{2v,r})$ in the matrix $H_r$ of (1). Every such column set
consists of the following components:
• A column set $H_{r-1}(c_{2v-2j,r-1})$ partitioned into two parts
that are placed in the left and right sides of $H_r$.
• Two sets of the same $j$ columns of $H_{r-1}$ placed in the left
and right sides of $H_r$. (These column sets are not connected
with any codewords of $C_{r-1}$.)
For $j = 0, 1, \ldots, v-2$ and for every codeword $c_{2v-2j,r-1}$
of even weight, we explain the summands of formula (5).
– The summand $\sum_{j=0}^{v-2} 2^{2v-2j-1} A_{2v-2j}^{(r-1)} \binom{\frac{1}{2} n_r - 2v + 2j}{j}$ of (5).
A column set $\Gamma = H_{r-1}(c_{2v-2j,r-1})$ is partitioned into two
parts. Every part contains an odd (resp. even) number of
columns if $j$ is odd (resp. even). The partition is executed
in all possible ways. The number of partitions is equal to
$2^{2v-2j-1}$. The obtained parts are placed in the left and right
sides of $H_r$.
Also, in each of the two submatrices $H_{r-1}$ of (1) we take the
same set of $j$ columns that do not belong to $\Gamma$. The number of
such $j$-sets is equal to $\binom{\frac{1}{2} n_r - 2v + 2j}{j}$. As a result, in the right
side of $H_r$ we always take an even number of columns.
– The summand $\Delta_v^{(r)} = \binom{\frac{1}{2} n_r}{v}$ of (5).
If $v$ is even then in each of the two submatrices $H_{r-1}$ of (1) we
take the same set of $v$ columns. The number of variants is
equal to $\binom{\frac{1}{2} n_r}{v}$.
(ii) Let us consider all possible structures of codewords
$c_{2v+1,r}$ of odd weight $2v+1$ and the corresponding column
sets $H_r(c_{2v+1,r})$ in the matrix $H_r$ of (1). Every such column
set consists of the following components:
• A column set $H_{r-1}(c_{2v+1-2j,r-1})$ partitioned into two
parts that are placed in the left and right sides of $H_r$.
• Two sets of the same $j$ columns of $H_{r-1}$ placed in the left
and right sides of $H_r$. (These column sets are not connected
with any codewords of $C_{r-1}$.)
For $j = 0, 1, \ldots, v-2$ and for every codeword $c_{2v+1-2j,r-1}$
of odd weight, we explain formula (6).
A column set $\Gamma = H_{r-1}(c_{2v+1-2j,r-1})$ is partitioned into
two parts. One part, say $A_{odd}$, contains an odd number of
columns; the other part, say $B_{even}$, contains an even number of
columns. The partition is executed in all possible ways. The
number of partitions is equal to $2^{2v-2j}$.
If $j$ is odd then the part $B_{even}$ (resp. $A_{odd}$) is placed in the
left (resp. right) side of $H_r$.
If $j$ is even (including $j = 0$) then the part $A_{odd}$ (resp. $B_{even}$) is
placed in the left (resp. right) side of $H_r$.
Also, in each of the two submatrices $H_{r-1}$ of (1) we take the
same set of $j$ columns that do not belong to $\Gamma$. The number of
such $j$-sets is equal to $\binom{\frac{1}{2} n_r - 2v - 1 + 2j}{j}$. As a result, in the right
side of $H_r$ we always take an even number of columns.
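The counting argument can be cross-checked by brute force on a small case. The sketch below (helper names `double` and `spectrum` are assumptions) doubles the matrix $S$ of Section II and enumerates all $2^{10}$ binary words to obtain the weight spectrum of the resulting $[10, 5, 4]$ code directly from its parity check matrix.

```python
from itertools import product


def double(h_prev):
    """Doubling construction (1) on a list-of-rows matrix."""
    n_prev = len(h_prev[0])
    return [[0] * n_prev + [1] * n_prev] + [row + row for row in h_prev]


def spectrum(h):
    """Weight spectrum of the code with parity check matrix h, by enumeration."""
    n = len(h[0])
    counts = [0] * (n + 1)
    for word in product((0, 1), repeat=n):
        # word is a codeword iff every parity check is satisfied over GF(2)
        if all(sum(a * b for a, b in zip(row, word)) % 2 == 0 for row in h):
            counts[sum(word)] += 1
    return counts


S = [[1, 0, 0, 0, 1],
     [0, 1, 0, 0, 1],
     [0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1]]
print(spectrum(S))            # spectrum of the [5, 1, 5] code
print(spectrum(double(S)))    # spectrum of the doubled [10, 5, 4] code
```

The doubled code has 10 codewords of weight 4, 16 of weight 5, and 5 of weight 8, which is what (5) and (6) give when applied to the spectrum of the $[5, 1, 5]$ code.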
Now we give the weight spectrum for the duals of quasi-perfect
codes.
Theorem 8. Let $d_r \le 4$. Assume that an $[n_r, n_r-r, d_r]$ code
$C_r$ is created from a $[\frac{1}{2} n_r, \frac{1}{2} n_r - r + 1, d_{r-1}]$ code $C_{r-1}$ by the
doubling construction (1). Then the weight spectrum $\{A_w^{(r)\perp},\ w \le n_r\}$
of the $[n_r, r, d_r^{\perp}]$ code dual to $C_r$ can be obtained from the
weight spectrum $\{A_w^{(r-1)\perp},\ w \le \frac{1}{2} n_r\}$ of the $[\frac{1}{2} n_r, r-1, d_{r-1}^{\perp}]$
code dual to $C_{r-1}$ as follows:
\[ A_{2v}^{(r)\perp} = A_v^{(r-1)\perp} + \begin{cases} 0 & \text{if } 2v \ne \frac{1}{2} n_r, \\ 2^{r-1} & \text{if } 2v = \frac{1}{2} n_r. \end{cases} \tag{7} \]
Proof. We consider the matrix (1) as a generator matrix of the
dual code. If a codeword of the dual code is formed without
the top row, then its weight is equal to twice the
weight of the corresponding word formed from the rows of the matrix
$H_{r-1}$. If the top row is included in the codeword, then the two halves of the codeword are complements of each other, so its weight is
equal to $\frac{1}{2} n_r$.
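The two cases of the proof give a very short update rule for the dual spectrum. In the sketch below (the name `doubled_dual_spectrum` and the list representation are assumptions), the $2^{r-1}$ words that include the top row are counted as the total number of words of the dual of $C_{r-1}$.

```python
def doubled_dual_spectrum(b_prev):
    """Dual weight spectrum after one doubling step, following the proof of Theorem 8.

    b_prev[w] is the number of weight-w words in the dual of C_{r-1}.
    """
    half = len(b_prev) - 1            # n_{r-1} = n_r / 2
    b = [0] * (2 * half + 1)
    for w, count in enumerate(b_prev):
        b[2 * w] += count             # words without the top row: weight doubles
    b[half] += sum(b_prev)            # words with the top row: 2^(r-1) words of weight n_r / 2
    return b


b4 = [1, 0, 6, 0, 1]                  # [4, 3, 2] even-weight code, dual of the [4, 1, 4] code
b8 = doubled_dual_spectrum(b4)
b16 = doubled_dual_spectrum(b8)
print(b8)
print(b16)
```

Here `b8` is the spectrum of the self-dual $[8, 4, 4]$ extended Hamming code, and `b16` has 30 words of weight 8, as expected for the $[16, 5, 8]$ dual of the $[16, 11, 4]$ extended Hamming code.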
IV. ON CORRECTION OF ERASURE PATTERNS OF HIGH WEIGHT
Knowledge of the weight spectrum of a code opens a way
to calculate important probabilities for the code, such as the
conditional probability of correct decoding of erasure patterns,
the probability of undetected error, and so on. In binary codes, the
number of parity check bits is larger than the code distance. This
is a good reason to investigate the total ability of codes to correct
erasure patterns of high weight (equal to or greater than the code
distance).
The necessary condition for correction of an erasure
pattern of weight $\rho$ is the full rank of the submatrix consisting of the columns of
the code parity check matrix that correspond to the erased positions.
Let $S_\rho$ be the number of erasure patterns of weight $\rho$
that can be corrected by a code (equivalently, for a code
parity check matrix, $S_\rho$ is the number of distinct sets of $\rho$
linearly independent columns, or the number of distinct $r \times \rho$
submatrices of full rank).
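For a short code, $S_\rho$ can be counted directly from this rank condition. The sketch below is an illustration under stated assumptions: columns are stored as integers, `gf2_rank` and `count_correctable` are hypothetical helper names, and the test matrix is a parity check matrix of the $[8, 4, 4]$ extended Hamming code whose columns are all 4-bit vectors ending in 1.

```python
from itertools import combinations


def gf2_rank(vectors):
    """Rank over GF(2) of integer-encoded vectors (XOR basis reduction)."""
    basis = []
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)     # clears the leading bit of b in v if it is set
        if v:
            basis.append(v)
    return len(basis)


def count_correctable(cols, rho):
    """Number of rho-subsets of columns that are linearly independent."""
    return sum(1 for s in combinations(cols, rho) if gf2_rank(s) == rho)


cols = [(x << 1) | 1 for x in range(8)]   # columns (x1 x2 x3 1) as integers
S3 = count_correctable(cols, 3)
S4 = count_correctable(cols, 4)
S5 = count_correctable(cols, 5)
print(S3, S4, S5)
```

For this code all 56 triples of columns are independent, 56 of the 70 quadruples are independent (the remaining 14 correspond to the weight 4 codewords), and no set of 5 columns can be independent since $r = 4$.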
For a code of length $n$, let $\delta_\rho = S_\rho / \binom{n}{\rho}$ be the conditional
probability of correct decoding of erasure patterns of weight $\rho$.
Further, for an $[n, n-r, d]$ code with weight spectrum
$A_0, A_1, \ldots, A_n$ we introduce the function
\[ \Psi(n, d, \rho) = \binom{n}{\rho} - \sum_{w=d}^{\rho} A_w \binom{n-w}{\rho-w}, \quad d \le \rho \le r. \tag{8} \]
This function gives a lower estimate of $S_\rho$, see [7], [8].
Theorem 9. For an $[n, n-r, d]$ code, the conditional probability $\delta_\rho$ and the value $S_\rho$ satisfy the following lower estimates:
\[ \delta_\rho \ge \frac{\Psi(n, d, \rho)}{\binom{n}{\rho}}, \quad S_\rho \ge \Psi(n, d, \rho), \quad d \le \rho \le r. \tag{9} \]
In particular, the equalities
\[ \delta_\rho = \frac{\Psi(n, d, \rho)}{\binom{n}{\rho}}, \quad S_\rho = \Psi(n, d, \rho) \]
hold under the condition
\[ \rho \le d + \frac{d-1}{2}. \]
The proof of Theorem 9 is based on the fact that the value
$S_\rho$ is equal to the difference between the total number of sets
of $\rho$ columns of a parity check matrix and the number of
sets of $\rho$ linearly dependent columns.
Now we use the known binomial approximation of the weight
spectrum of a binary linear code [9]–[11], $A_w \approx 2^{-z} \binom{n}{w}$,
$r - 1 < z \le r$, $w \ge d$, where $z$ is a real value taking
into account (in principle) correction terms in the mentioned
approximations and the weight region $w \ge d$. We obtain the
following approximation of the function $S_\rho$ for the region
$d \le \rho \le r$:
\[ S_\rho \ge \binom{n}{\rho} - \sum_{w=d}^{\rho} A_w \binom{n-w}{\rho-w} \approx \binom{n}{\rho} - 2^{-z} \sum_{w=d}^{\rho} \binom{n}{w} \binom{n-w}{\rho-w} = \binom{n}{\rho} - 2^{-z} \binom{n}{\rho} \sum_{w=d}^{\rho} \binom{\rho}{w}. \]
From here, using [11, Lemma 10.8], we obtain an estimate of
the conditional probability $\delta_\rho$ of correct decoding of erasure
patterns of high weight $\rho$:
\[ \delta_\rho \ge \frac{S_\rho}{\binom{n}{\rho}} \approx 1 - 2^{-z} \sum_{w=d}^{\rho} \binom{\rho}{w} \approx 1 - 2^{-z} \cdot 2^{\rho H(d/\rho)} \ge 1 - 2^{\rho - z}, \quad d \le \rho < z, \]
where $H(d/\rho)$ is the binary entropy.
The proposed estimate shows that, for a fixed $r$, the estimate of the probability $\delta_\rho$ decreases exponentially as $\rho$ grows. Therefore
the reasonable extended region of correctable erasure patterns
is $\rho < 2d$.
The following lemma allows us to improve the estimates of
Theorem 9 using a recursive approach.
Lemma 10. Any set of $\rho$ linearly dependent columns of a parity
check matrix is a union of $w$ columns with zero sum
(corresponding to a weight $w$ codeword) and a set of $\rho - w$
linearly independent columns, where $d \le w \le \rho$.
We give a recursive form of a function of type (8):
\[ \tilde{\Psi}(n, d, \rho) = \binom{n}{\rho} - \sum_{w=d}^{\rho} A_w(n)\, \tilde{\Psi}(n-w, d, \rho-w), \]
where $A_w(n)$ is the number of weight $w$ words in a (shortened)
code of length $n$.
A recursive estimate of the conditional probability of correct
decoding of erasure patterns of weight $\rho$ and the first and
second steps of the recursion have the form, respectively,
\[ \tilde{\delta}(n, d, \rho) = \frac{\tilde{\Psi}(n, d, \rho)}{\binom{n}{\rho}} = 1 - \sum_{w=d}^{\rho} A_w(n)\, \tilde{\delta}(n-w, d, \rho-w)\, \frac{\binom{n-w}{\rho-w}}{\binom{n}{\rho}}; \]
\[ \tilde{\delta}_2(n, d, \rho) = 1 - \sum_{w_1=d}^{\rho} A_{w_1}(n)\, \frac{\binom{n-w_1}{\rho-w_1}}{\binom{n}{\rho}} \times \left[ 1 - \sum_{w_2=d}^{\rho-w_1} A_{w_2}(n-w_1)\, \frac{\binom{n-w_1-w_2}{\rho-w_1-w_2}}{\binom{n-w_1}{\rho-w_1}} \right]. \]
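Formula (8) and its recursive form are easy to evaluate numerically. The sketch below makes several assumptions: a spectrum is passed as a dictionary `{w: A_w}`, the names `psi`, `delta_lower`, and `psi_tilde` are illustrative, and `spectrum_of` is a hypothetical callback that must return the spectrum of the (shortened) code of a given length. For $\rho < d$ the sum in the recursion is empty, so $\tilde{\Psi}$ reduces to $\binom{n}{\rho}$. The example uses the $[64, 57, 4]$ extended Hamming code with the standard value $A_4 = n(n-1)(n-2)/24$ and no codewords of weight 5.

```python
from math import comb


def psi(n, d, rho, a):
    """Function (8); a maps a weight w to A_w (missing weights count as 0)."""
    return comb(n, rho) - sum(a.get(w, 0) * comb(n - w, rho - w)
                              for w in range(d, rho + 1))


def delta_lower(n, d, rho, a):
    """Lower estimate (9) of the conditional probability delta_rho."""
    return psi(n, d, rho, a) / comb(n, rho)


def psi_tilde(n, d, rho, spectrum_of):
    """Recursive form of (8); spectrum_of(m) is a hypothetical callback
    returning {w: A_w(m)} for the (shortened) code of length m."""
    if rho < d:
        return comb(n, rho)            # empty sum in the recursion
    a = spectrum_of(n)
    return comb(n, rho) - sum(
        a.get(w, 0) * psi_tilde(n - w, d, rho - w, spectrum_of)
        for w in range(d, rho + 1))


n = 64
a64 = {4: n * (n - 1) * (n - 2) // 24}   # weight 4 codewords of the [64, 57, 4] code
d4 = delta_lower(n, 4, 4, a64)
d5 = delta_lower(n, 4, 5, a64)
print(round(d4, 4), round(d5, 4))
```

The two printed values are 0.9836 and 0.918, which agrees with the first two entries of the Hamming, $r = 7$ row of Table I if that row refers to the $[64, 57, 4]$ extended Hamming code. For $\rho < 2d$ the recursion stops after one step, so `psi_tilde` returns the same value as `psi` whenever `spectrum_of(n)` returns the spectrum used in `psi`.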
V. APPLICATION TO MEMORY
An important area of application of quasi-perfect codes
is computer memory (Flash or SSD). Their ability to correct
a large number of erasures instead of one error, together with a very low
probability of undetected error, gives us a strong incentive to
investigate the conditional probability of correct decoding for
erasure patterns of high weight. As an example useful for
applications, we give two tables: the first one for the conditional
probability of correct decoding of erasure patterns of weights
equal to or greater than the code distance, and the second one for the
(unconditional) probability of decoding failure of the product of Panchenko codes in a memory channel with
different error probabilities.
The decoding algorithm for the product of Panchenko codes consists of the following steps.
1) Error detection in rows and columns of the received
word (in parallel).
2) Check (in parallel) of the detected row (column) list for
correctability as an erasure pattern.
3) Correction of the chosen erasure pattern (row or column) and output.
The check for correctability is executed in the extended area up to
$d^+$ erasures.
Table I gives a comparison between Hamming and Panchenko codes with 7 and 8 parity symbols. We can see from
the table that, under extended decoding with correction of 4, 5,
6, and 7 erasures, the conditional probability of correct decoding decreases from nearly 1 to
approximately 1/2.
Table II demonstrates a fast decrease of the probability of
decoding failure, for a fixed number of parity bits, as the
decoding area for the product of two Panchenko codes is
extended from 3 up to
6 erasures.
TABLE I
CONDITIONAL PROBABILITY $\delta_\rho$ OF CORRECT DECODING OF ERASURE PATTERNS OF WEIGHT $\rho$ FOR HAMMING AND PANCHENKO CODES
code        r    ρ = d = 4    ρ = 5     ρ = 6     ρ = 7
Hamming     7    0.9836       0.9180    0.7469    0.4121
Panchenko   7    0.9870       0.9287    0.7656    0.4306
Hamming     8    0.9920       0.9600    0.8741    0.6879
Panchenko   8    0.9934       0.9647    0.8830    0.6996
TABLE II
FAILURE PROBABILITY FOR PRODUCT OF PANCHENKO CODES [72, 64, 4]
p           d+ = 3     d+ = 4     d+ = 5       d+ = 6
10^-1       1          1          1            1
10^-2       0.996      0.988      0.967        0.926
5 · 10^-3   0.250      0.092      0.027        0.008
10^-3       1.1e-09    1.6e-12    7.0e-14      5.8e-14
5 · 10^-4   2.3e-14    5.1e-18    1.045e-18    1.029e-18
ACKNOWLEDGMENT
The research was carried out at the IITP RAS at the expense
of the Russian Foundation for Sciences (project 14-50-00150).
REFERENCES
[1] A. Davydov and L. Tombak, “Quasiperfect linear binary codes with
minimal distance 4 and complete caps in projective geometry,” Problems
of Information Transmission, vol. 25, no. 4, pp. 265–275, Dec. 1989.
[2] A. Bruen and D. Wehlau, “Long binary linear codes and large caps in
projective space,” Designs, Codes and Cryptography, vol. 17, no. 1, pp.
37–60, Dec. 1999.
[3] D. Wehlau, “Complete caps in projective space which are disjoint from
a codimension 2 subspace,” in Finite Geometries, ser. Developments in
Mathematics, A. Blokhuis, J. Hirschfeld, D. Jungnickel, and J. Thas,
Eds. Dordrecht: Kluwer Academic Publishers, 2001, vol. 3, pp. 347–
361, corrected version: https://arxiv.org/abs/math/0403031.
[4] V. Panchenko, “On optimization of linear code with distance 4,” in
Proceedings 8th All-Union Conference on Coding Theory and Communications, Part 2: Coding Theory, Kuibyshev (Moscow), USSR, 1981,
pp. 132–134, (in Russian).
[5] A. Davydov and L. Tombak, “An alternative to the Hamming code in
the class of SEC-DED codes in semiconductor memory,” IEEE Trans.
Inf. Theory, vol. IT-37, no. 3, pp. 897–902, May 1991.
[6] V. Afanassiev, A. Davydov, and D. Zigangirov, “Design and analysis
of codes with distance 4 and 6 minimizing the probability of decoder
error,” Journal of Communications Technology and Electronics, vol. 61,
no. 12, pp. 1440–1455, Dec. 2016.
[7] ——, “Estimation of the conditional probability of correct decoding of
erasure patterns for linear codes,” Informatsionnye Protsessy, vol. 16,
no. 4, pp. 382–404, Dec. 2016, (in Russian).
[8] O. Popov, “On estimate of ability of linear codes to correct erasures
and to detect errors when erasures are,” Electrosvyaz’, vol. 10, 1967, (in
Russian).
[9] K. Cheung, “The weight distribution and randomness of linear codes,”
Jet Propulsion Lab., California Institute of Technology, Pasadena,
CA, USA, TDA Progress Report 42-97, 1989. [Online]. Available:
https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19890018521.pdf
[10] I. Krasikov and S. Litsyn, “On spectra of BCH codes,” IEEE Trans. Inf.
Theory, vol. IT-41, no. 3, pp. 786–788, May 1995.
[11] F. MacWilliams and N. Sloane, The Theory of Error-Correcting Codes.
Amsterdam, New York: North-Holland Publishing Company, 1977.