Multi-Kernel Polar Codes:
Concept and Design Principles
arXiv:2001.04670v1 [cs.IT] 14 Jan 2020
Valerio Bioglio, Frédéric Gabry, Ingmar Land, Jean-Claude Belfiore
Mathematical and Algorithmic Sciences Lab
France Research Center, Huawei Technologies France SASU
Email: {valerio.bioglio,frederic.gabry,ingmar.land,jean.claude.belfiore}@huawei.com
Abstract—In this paper, we propose a new polar code construction by employing kernels of different sizes in the Kronecker
product of the transformation matrix, thus generalizing the
original construction by Arikan. The proposed multi-kernel
polar code allows for more flexibility in terms of the code
length, moreover allowing for various new design principles. We
describe in detail encoding as well as successive cancellation
(SC) decoding and SC list (SCL) decoding, and we provide a
novel design method for the frozen set that optimises
the performance under list decoding, as opposed to the original
reliability-based code design. Finally, we numerically demonstrate
the advantage of multi-kernel polar codes under the new design
principles compared to punctured and shortened polar codes.
I. INTRODUCTION
Polar codes are a family of error-correcting codes recently
introduced by Arikan in [1] as the first codes able to provably
achieve channel capacity for a large number of channels. The
construction of a polar code of length $N = 2^n$ is based on
the recursive concatenation of the binary matrix
$T_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$,
referred to as the kernel of the transformation. This operation
results in a transformation matrix $T_N = T_2^{\otimes n}$, given by the
n-fold Kronecker power of the kernel matrix $T_2$, converting the
physical channel into N virtual synthetic channels characterized
by either very high or very low reliability. This channel
polarization effect leads to a portion of fully reliable channels
that tends to the channel capacity for symmetric binary-input
memoryless channels, when the code length tends to infinity.
In the asymptotic case, successive cancellation (SC) decoding
is sufficient to achieve the channel capacity [1]. In the
finite-length regime, successive cancellation list (SCL) decoding [2]
leads to performance competitive with many other classes of
channel codes, like LDPC codes, particularly when an outer
CRC code or generalizations thereof are applied [3]. Due to
their excellent performance, polar codes were recently adopted
as moderate-length codes for 5G [4].
As conjectured in [1], the polarization phenomenon obtained
by the Kronecker powers of $T_2$ can be extended to other
kernels. In [5], the authors give necessary and sufficient
conditions for kernels to polarize, allowing researchers to propose
larger binary kernels as bases of novel polar codes [6].
Non-binary kernels have been proposed to improve the asymptotic
error probability [7], [8], while mixing binary and non-binary
kernels showed additional improvement over homogeneous
kernel constructions [9]. As a result, current constructions
restrict the length of polar codes to be of the form $N = l^n$.
This code length constraint can be a huge limitation to the
practical use of polar codes, since only a few block lengths can
be expressed as a power of an integer. Punctured [10] and
shortened [11] polar codes have been proposed to increase the
number of achievable block lengths. Even if these techniques
offer a practical way to construct codes of arbitrary lengths,
they have several disadvantages. In fact, punctured and
shortened codes are decoded by means of their mother polar codes,
increasing the decoding latency with respect to the actual
code length. Moreover, the location of dummy bits, altering
the polarization of the codes, has to be carefully chosen to
avoid catastrophic error-rate performance [12]. Finally, the
lack of structure between the frozen sets and the puncturing
or shortening patterns complicates the code design [13].
Fig. 1: Tanner graph of the multi-kernel polar code of length
N = 12 for the transformation matrix $T_{12} = T_2 \otimes T_2 \otimes T_3$.
In this paper, we present multi-kernel polar codes, which
generalize polar codes by mixing kernels of different sizes
over the same binary alphabet. These codes were theorized in
[14], where their polarization rate is calculated algebraically,
and explicitly constructed in [15]; conceptually, they make it
possible to construct polar codes of any block length while keeping
the polarization effect [16]. The encoding follows the general
structure of polar codes, and the decoding can be performed
by successive cancellation as well. Building on our previous
contributions in [15] and [17], in this paper we present a
thorough description and analysis of the construction of multi-kernel
polar codes, discussing the issues related to the choice of
component kernels. Finally, similarly to [18], we combine the
aforementioned designs into a novel hybrid design exhibiting
good error correction performance at moderate length.
This paper is organized as follows. In Section II, we present
the general construction, including the encoding and decoding,
of multi-kernel polar codes. In Section III we provide recommendations for the selection of kernels to be used in the
multi-kernel construction. In Section IV we describe explicitly
the different designs for multi-kernel polar codes, namely by
reliability, by minimum distance, and by a hybrid criterion. In
Section V we discuss the performance of the proposed codes,
and Section VI concludes this paper.
II. CODE AND DECODER
In this section, we introduce the structure, encoding and
decoding of multi-kernel polar codes. As an example, the
Tanner graph of a code of length N = 12 is depicted in Fig. 1,
comprising two kernels of size 2 and one kernel of size 3.
A. Code Structure and Encoding
Multi-kernel polar codes are a generalization of Arikan’s
polar codes obtained by using binary kernels of different sizes
in the construction of the transformation matrix of the code.
An (N, K) multi-kernel polar code of length N and dimension
K is defined by an N × N transformation matrix
$$T_N = T_{p_1} \otimes T_{p_2} \otimes \cdots \otimes T_{p_s}, \qquad (1)$$
with $N = p_1 \cdot p_2 \cdots p_s$, and a frozen set $\mathcal{F} \subset [N]$, where
$[N] = \{0, 1, 2, \ldots, N-1\}$, such that $|\mathcal{F}| = N - K$. The
information set is defined as $\mathcal{I} = \mathcal{F}^C$. Building blocks of
the code are the $p_i \times p_i$ matrices $T_{p_i}$ with binary entries,
which define kernels of dimension $p_i$ [19]. A list of binary
polarizing kernels with maximum exponents, i.e. maximum
polarization, can be found in [7]. However, other kernels may
be advantageous for the design of multi-kernel polar codes,
as described in Sec. III. The frozen set collects the indices of
the input vector to be frozen: its design will be discussed in
Section IV. Codewords $x \in \mathbb{F}_2^N$ are generated from the input
vector $u \in \mathbb{F}_2^N$ by $x = u \cdot T_N$, where $u_j = 0$ for $j \in \mathcal{F}$ and the
remaining K entries $u_i$, $i \in \mathcal{I}$, store the information to
be transmitted. Note that outer CRCs or similar parity checks
may be inserted as for original polar codes. In the following,
we will refer to polar codes when the transformation matrix
is generated using a single kernel according to the original
formulation by Arikan, while we will refer to multi-kernel
polar codes if more than one binary kernel is used in the
transformation matrix generation.
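The encoding rule x = u · T_N can be sketched in a few lines of plain Python. This is only an illustrative sketch: the helper names (`kron`, `transformation_matrix`, `encode`) and the information set used at the end are hypothetical choices made for the example, not a designed frozen set; the kernels T2 and T3 are the ones given later in Section III-C.

```python
def kron(A, B):
    """Kronecker product of two binary matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

# Kernels as given in Section III-C of the text.
T2 = [[1, 0], [1, 1]]
T3 = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]

def transformation_matrix(kernels):
    """T_N = T_{p_1} (x) T_{p_2} (x) ... (x) T_{p_s}, in the given order."""
    T = [[1]]
    for kernel in kernels:
        T = kron(T, kernel)
    return T

def encode(info_bits, info_set, T):
    """Place the information bits on the (sorted) information set,
    set all frozen positions to 0 and compute x = u * T over F_2."""
    N = len(T)
    u = [0] * N
    for bit, i in zip(info_bits, sorted(info_set)):
        u[i] = bit
    return [sum(u[i] * T[i][j] for i in range(N)) % 2 for j in range(N)]

# Code of Fig. 1: T_12 = T_2 (x) T_2 (x) T_3.
T12 = transformation_matrix([T2, T2, T3])
# Hypothetical information set, chosen only to exercise the encoder.
codeword = encode([1, 0, 1], {9, 10, 11}, T12)
```

Since the Kronecker product is not commutative, the order of the list passed to `transformation_matrix` matters, as discussed next.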
The order of the kernels in (1) is important for the design
of the code, as this operation is not commutative. Changing
the order of kernels in TN is equivalent to permuting its rows
and columns, since for any Kronecker product there exist two
permutation matrices P, Q such that A ⊗ B = P · (B ⊗ A) ·
Q [20]. In practice, changing the order of the kernels leads
to a transformation matrix in which rows and columns are
permuted as compared to the original transformation matrix.
Every frozen set imposed on the original matrix can hence
be mapped to a frozen set of the permuted matrix: all the
kernel orders lead to equivalent codes. However, this order
may have an effect on the polarization of the virtual channels
as discussed in Section IV-A.
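The permutation property can be checked numerically for small kernels. The sketch below assumes square kernels and uses one particular choice of permutation (swapping the roles of the two index digits, applied to both rows and columns); the function names are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def swap_index(idx, p_a, p_b):
    """Map index i*p_b + k of A (x) B to index k*p_a + i of B (x) A."""
    i, k = divmod(idx, p_b)
    return k * p_a + i

def reorder(M, p_a, p_b):
    """Permute rows and columns of M = B (x) A so that it equals A (x) B,
    where A is p_a x p_a and B is p_b x p_b."""
    n = len(M)
    perm = [swap_index(r, p_a, p_b) for r in range(n)]
    return [[M[perm[r]][perm[c]] for c in range(n)] for r in range(n)]

T2 = [[1, 0], [1, 1]]
T3 = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]

# The two orders give different matrices ...
different = kron(T2, T3) != kron(T3, T2)
# ... that coincide after permuting rows and columns.
equivalent = reorder(kron(T3, T2), 2, 3) == kron(T2, T3)
```

Both flags evaluate to true for these kernels, which illustrates why a frozen set designed for one order can be mapped to the other.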
B. Tanner Graph
The structure of multi-kernel polar codes can be illustrated
by the Tanner graph as depicted in Figure 1. This graph
describes the transformation matrix TN of the code, and
consists of various pi × pi boxes, each corresponding to a
kernel Tpi which defines the relation between the input vector
and the output vector. A pi × pi box has pi inputs and outputs.
A stage of the graph corresponds to a factor in the Kronecker
product of the transformation matrix, and is depicted by N/pi
boxes vertically distributed, for a total of s stages, counted
from the codeword x (right) to the input vector u (left).
The connections between stages are implicitly defined
by the Kronecker product. Stage i has $N/p_i$ boxes, each
one representing a kernel $T_{p_i}$, that are connected to the
$N/p_{i-1}$ boxes of stage i − 1 through an edge permutation $\pi_i$.
These permutations operate in blocks, where two boxes in
different blocks are not connected. Denoting by
$N_i = \prod_{j=1}^{i-1} p_j$
the partial product of the kernel sizes up to stage i, with
$N_1 = 1$, we can divide the boxes forming stages i and
i − 1 into $N/N_{i+1}$ blocks. Inside a block, boxes of stage
i − 1 are further divided into $N_i$ sub-blocks, so that the
j-th box of stage i is connected to the j-th output of each
sub-block of stage i − 1. This canonical permutation $\rho_i$,
depicted in Equation (2), will be used as a basis to create
the general permutations between stages. In fact, given the
canonical permutation $\rho_i$, the permutation $\pi_i$ is given by
$$\pi_i = (\rho_i \mid \rho_i + N_{i+1} \mid \rho_i + 2N_{i+1} \mid \ldots \mid \rho_i + (N/N_{i+1} - 1)N_{i+1})$$
for i = 2, . . . , s. Note that for the last stage, we have $\pi_s = \rho_s$.
The first permutation $\pi_1$, acting like the bit-reversal permutation
for polar codes, is an exception obtained by inverting the product
of the other permutations as $\pi_1 = (\pi_2 \cdots \pi_s)^{-1}$.
C. Decoding
Decoding of multi-kernel polar codes is performed by
successive cancellation (SC) [1] on the Tanner graph of the
code. Similar to polar codes, enhanced SC-based decoding
methods, like simplified SC (SSC) [21], SC list (SCL) [2] or
SC stack decoding [22] may be employed. In SC, bits are
decoded sequentially using the log-likelihood ratios (LLRs)
of the received symbols along with the (estimated) previously
decoded bits. If $\lambda_i$ corresponds to the LLR of input bit $u_i$, an
SC decoder sequentially evaluates
$$\lambda_i = f_i^N(l_0, l_1, \ldots, l_{N-1}, \hat{u}_0, \hat{u}_1, \ldots, \hat{u}_{i-1}) \qquad (3)$$
for every i from 0 to N − 1, where $l_i$ corresponds to the
LLR of the code bit $x_i$. In the following, we assume a BPSK
transmission over an AWGN channel, referred to as the
BI-AWGN channel, denoting with $E_s$ the energy of the transmitted
symbols and with $N_0$ the single-sided noise power density.
With constellation points ±1, the SNR may be given in $E_s/N_0$
or in $E_b/N_0 = 1/R \cdot E_s/N_0$, where R denotes the code rate
and $\sigma^2 = N_0/(2E_s)$ is the variance of the AWGN. With
$y = [y_0, \ldots, y_{N-1}]$ denoting the output of the BI-AWGN
channel, the channel LLRs are computed as $l_i = 2y_i/\sigma^2$.

$$\rho_i = \begin{pmatrix}
1 & 2 & \cdots & N_i & N_i+1 & N_i+2 & \cdots & (p_i-1)N_i+1 & \cdots & N_{i+1} \\
1 & p_i+1 & \cdots & (N_i-1)p_i+1 & 2 & p_i+2 & \cdots & p_i & \cdots & N_{i+1}
\end{pmatrix} \qquad (2)$$

The recursive structure of the transformation matrix of
multi-kernel polar codes, as for polar codes, makes it possible to
drastically reduce the decoding complexity by performing the LLR
computation kernel by kernel. In fact, LLRs can be calculated
in the kernel boxes and passed along the Tanner graph of the
code to the other kernel boxes from the right to the left, with
hard decisions on decoded bits flowing from left to right. These
hard decisions, representing the estimates of the intermediate
bits, are used by kernel boxes to calculate intermediate LLRs.
Fig. 2: A p × p box corresponding to kernel $T_p$ (left side:
$\hat{u}_0, \lambda_0, \ldots, \hat{u}_{p-1}, \lambda_{p-1}$; right side: $x_0, L_0, \ldots, x_{p-1}, L_{p-1}$).
The p × p box corresponding to a $T_p$ kernel is depicted
in Fig. 2, having $u = [u_0, u_1, \ldots, u_{p-1}]$ as input vector and
$x = [x_0, x_1, \ldots, x_{p-1}]$ as output vector. Hard decisions on the
output vector are calculated via multiplication with the kernel
matrix, as $\hat{x} = \hat{u} \cdot T_p$. The LLR calculation is more complex;
denoting by $L_i$ the input LLRs and by $\lambda_i$ the output LLRs
(seen right to left), the SC equation (3) can be simplified as
$$\lambda_i = f_i^p(L_0, L_1, \ldots, L_{p-1}, \hat{u}_0, \hat{u}_1, \ldots, \hat{u}_{i-1}), \qquad (4)$$
taking into account only LLRs and bit estimates belonging to
the present box. The formulation of the LLR update functions
$f_i^p$ for specific kernels will be discussed in Sec. III.
III. KERNEL ANALYSIS
The polarization phenomenon, originally proved for matrix
$T_2$, has been extended to binary matrices in [5] and to arbitrary
finite fields in [19], where sufficient and necessary conditions
are provided for a square matrix to polarize. A polarizing
matrix is called a kernel, and can be used in the construction
of polar codes. Multi-kernel polar codes are based on the
generalized polarization effect obtained by mixing kernels of
different sizes [16]. Compared to polar codes, our construction
makes it possible to exploit the distance properties of the kernels
to improve the error correction performance, in particular for small
block lengths. Here we analyze the structure and the decoding
complexity of large kernels, providing recommendations for
the design of kernels for multi-kernel polar codes.
A. Minimum-Distance Spectrum
The speed of polarization of a kernel is evaluated via
the polarization exponent [7], calculated through the partial
distances of the kernel matrix. These partial distances are
defined as the minimum weights of a sequence given by
the sum of a row and any linear combination of following
rows. This notion is based on the nature of SC decoding,
and is used to drive the kernel design. Under SCL decoding,
however, polar codes show better performance than predicted
by the polarization exponent. Moreover, the polarization effect
is less important than distance properties for short codes, and
kernels should be designed taking this aspect into account. We
conjecture the notion of minimum-distance spectrum [17] to
be more effective in these scenarios.
The minimum-distance spectrum $S_{T_p}$ of a kernel $T_p$ is
defined as the mapping from dimension k, k = 1, 2, . . . , p,
to the largest minimum distance achievable by any (p, k)
subcode of $T_p$, i.e. by any code having k rows of $T_p$ as generator
matrix. More formally, if $T_p^{(\mathcal{R})}$ is the matrix formed by the
rows of $T_p$ indexed by $\mathcal{R} \subset [p]$ and d(A) is the minimum
distance of the code generated by the rows of matrix A, then
the minimum-distance spectrum of $T_p$ is defined as
$$S_{T_p}(k) = \max_{|\mathcal{R}| = k} d\left(T_p^{(\mathcal{R})}\right), \qquad (5)$$
for k = 1, . . . , p. The optimal row set $\mathcal{R}_p^k \subset [p]$ collects the indices
of the k rows of $T_p$ forming the generator matrix of the optimal
$(p, k, S_{T_p}(k))$ code extracted from the kernel.
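Definition (5) can be evaluated by exhaustive search for small kernels. The following is a minimal brute-force sketch, with hypothetical function names; its cost grows exponentially with the kernel size, so it is only meant for kernels of a few rows. When several row sets achieve the same distance, the sketch keeps the first one found, which is an arbitrary tie-breaking choice.

```python
from itertools import combinations, product

def min_distance(rows):
    """Minimum Hamming weight over all nonzero combinations of the rows."""
    n = len(rows[0])
    best = n
    for coeffs in product([0, 1], repeat=len(rows)):
        if not any(coeffs):
            continue  # skip the all-zero combination
        word = [sum(c * r[j] for c, r in zip(coeffs, rows)) % 2
                for j in range(n)]
        best = min(best, sum(word))
    return best

def min_distance_spectrum(T):
    """Return (spectrum, row_sets): spectrum[k-1] = S_T(k) as in (5) and
    row_sets[k-1] = first row set (as a tuple) achieving it."""
    p = len(T)
    spectrum, row_sets = [], []
    for k in range(1, p + 1):
        best_d, best_R = 0, None
        for R in combinations(range(p), k):
            d = min_distance([T[i] for i in R])
            if d > best_d:
                best_d, best_R = d, R
        spectrum.append(best_d)
        row_sets.append(best_R)
    return spectrum, row_sets

# Kernels from Section III-C.
T2 = [[1, 0], [1, 1]]
T3 = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]
T5 = [[1, 1, 1, 1, 1],
      [1, 0, 0, 0, 0],
      [1, 0, 0, 1, 0],
      [1, 1, 1, 0, 0],
      [0, 0, 1, 1, 1]]
```

For these three kernels the search reproduces the spectra (2, 1), (3, 2, 1) and (5, 3, 2, 1, 1) quoted in Section III-C.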
The minimum-distance spectrum can be seen as a generalization of partial distances. In fact, partial distances are
obtained as the distances determined by selecting the rows
bottom up. As a result, row sets are nested, every row set being
a subset of all larger ones. The minimum-distance spectrum
relaxes this constraint, allowing for non-nested information
sets, and thus providing a new degree of freedom for the
overall code design. While partial distances are conceived for
characterizing SC decoding, the minimum-distance spectrum
seems to be more effective in portraying SCL decoding.
B. Kernel Decoding
The formulation of the decoding equations (4) for the LLR
calculation under SC decoding is of capital importance during
kernel design. With reference to Fig. 2, the (input) LLRs $L_i$
derive from previous decoding steps or, if the kernel is in the
first stage, are the LLRs of bits $x_j$, $L_j = L(x_j)$, calculated
from the received symbols; the (output) LLRs $\lambda_i$ represent
the LLRs of bits $u_i$, $\lambda_i = L(u_i)$. They can be expressed using
only input LLRs and hard decisions of the previously decoded
bits $u_0, \ldots, u_{i-1}$, as
$$\lambda_i = \ln \frac{\sum_{x \in X_0^{(i)}} \exp\left(\sum_{t=0}^{p-1} (1 - x_t) L_t\right)}{\sum_{x \in X_1^{(i)}} \exp\left(\sum_{t=0}^{p-1} (1 - x_t) L_t\right)}, \qquad (6)$$
where $X_a^{(i)} = \{v \cdot G : v = [u_0, \ldots, u_{i-1}, a, v_{i+1}, \ldots, v_{p-1}], v_j \in \mathbb{F}_2\}$,
a = 0, 1, with G denoting the kernel matrix $T_p$, which corresponds
to the marginalization over the unknown bits $v_{i+1}, \ldots, v_{p-1}$ [23].
Since $|X_a^{(i)}| = 2^{p-i-1}$, computing this expression is in general
exponential in the kernel size p, even if it can be simplified by
human inspection [24] or trellis-based decoding of block codes [25].
However, this formulation complicates the calculation of input-bit
reliability, restricting the design of the code to Monte-Carlo methods.
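Expression (6) can be evaluated directly by enumerating the unknown bits. The sketch below is a plain brute-force illustration with a hypothetical function name; it assumes that G is the kernel matrix itself and that LLRs are defined as ln(P(0)/P(1)). It is not numerically hardened for very large LLR magnitudes.

```python
import math
from itertools import product

def exact_kernel_llr(G, L, u_prev):
    """Evaluate (6) for input bit i = len(u_prev) of the p x p kernel G,
    given the input LLRs L and the previously decoded bits u_prev."""
    p, i = len(G), len(u_prev)
    sums = [0.0, 0.0]
    for a in (0, 1):
        # Marginalize over the unknown bits v_{i+1}, ..., v_{p-1}.
        for tail in product([0, 1], repeat=p - i - 1):
            v = list(u_prev) + [a] + list(tail)
            x = [sum(v[r] * G[r][t] for r in range(p)) % 2 for t in range(p)]
            sums[a] += math.exp(sum((1 - x[t]) * L[t] for t in range(p)))
    return math.log(sums[0] / sums[1])

T2 = [[1, 0], [1, 1]]
T3 = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]

# First bit of the T2 kernel for two arbitrary input LLRs.
lam0 = exact_kernel_llr(T2, [1.2, -0.7], [])
# Second bit of the T2 kernel once u_0 = 1 has been decided.
lam1 = exact_kernel_llr(T2, [1.2, -0.7], [1])
```

For T2 the two values coincide with the well-known closed forms given in Section III-C, which is a convenient sanity check before applying the function to larger kernels.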
In practice, analysis of the tree of recurrent relations of the
graph induced by the kernel matrix may make it possible to discriminate
among the different channel observations, rewriting expression
(6) using only basic operations of the LLR algebra [26]; if bits
$x_1, \ldots, x_j$ represent a repetition of bit $x_0$ and the
corresponding LLRs $L_i$ are based on independent observations, then
$$L(x_0 \mid L_0, \ldots, L_j) = L_0 + \ldots + L_j. \qquad (7)$$
On the other hand, if $x_0, \ldots, x_j$ are independent bits and the
LLRs $L_i$ are based on independent observations, then
$$L(x_0 \oplus \ldots \oplus x_j) = L_0 \boxplus \ldots \boxplus L_j \qquad (8)$$
where the $\boxplus$ operation is defined as
$$\mathop{\boxplus}_{t=0}^{j} a_t \triangleq 2 \tanh^{-1}\left(\prod_{t=0}^{j} \tanh\frac{a_t}{2}\right) \approx \min_{0 \le t \le j}(|a_t|) \cdot \prod_{t=0}^{j} \operatorname{sgn} a_t.$$
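The exact ⊞ operation and its min-sum approximation can be written as two small functions. This is a minimal sketch with hypothetical names; the clamping constant is an implementation choice added here to keep `atanh` finite when the input LLRs are very large.

```python
import math

def boxplus(*llrs):
    """Exact box-plus: 2 * atanh(prod tanh(a_t / 2))."""
    prod = 1.0
    for a in llrs:
        prod *= math.tanh(a / 2.0)
    # Clamp to keep atanh finite when tanh saturates (implementation choice).
    limit = 1.0 - 1e-15
    prod = max(-limit, min(limit, prod))
    return 2.0 * math.atanh(prod)

def boxplus_minsum(*llrs):
    """Min-sum approximation: min |a_t| times the product of the signs."""
    sign = 1.0
    for a in llrs:
        if a < 0:
            sign = -sign
    return sign * min(abs(a) for a in llrs)

exact = boxplus(1.2, -0.7)
approx = boxplus_minsum(1.2, -0.7)
```

The approximation keeps the sign of the exact value and never underestimates its magnitude, which is why it is commonly used in hardware-oriented decoders.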
If it is possible to formulate (6) with an expression involving
only + and ⊞ operations of the LLRs L0 , . . . , Lp−1 , we say
that the decoding equation is expressed in reduced form. In
such a case, the complexity becomes linear in p, and the analysis of the kernel polarization can be simplified as explained in
the next sections. Since the reducibility of general decoding
equations is an open problem, we conjecture that (6) cannot be
expressed in reduced form for all bit positions and all kernels;
we suggest approximating irreducible expressions with similar
expressions in reduced form.
C. Kernel Examples
The minimum-distance spectrum of the original kernel of size 2,
$$T_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad (9)$$
is straightforward: for dimension 1, the second row is used,
achieving distance 2, while for dimension 2, both rows are
used, achieving distance 1. Thus we have $S_{T_2} = (2, 1)$, with
optimal row sets $\mathcal{R}_2^1 = \{1\}$ and $\mathcal{R}_2^2 = \{0, 1\}$; they are nested
since $\mathcal{R}_2^1 \subset \mathcal{R}_2^2$. As a proof of concept for the different designs
presented in Section IV, we introduce the kernels
$$T_3 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad
T_5 = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \end{pmatrix} \qquad (10)$$
for the construction of multi-kernel polar codes. We calculate
their minimum-distance spectra and their decoding equations,
showing their flexibility compared to kernels presented in [7].
The spectrum of $T_3$ can be calculated as follows. For a code
of dimension K = 1, we select the first row (1 1 1) to maximize
the minimum distance, giving minimum distance 3 and
$\mathcal{R}_3^1 = \{0\}$; any other row selection
would result in a smaller minimum distance, namely 2. For
a code of dimension K = 2, the last two rows, (1 0 1) and
(0 1 1), are selected, generating a code of minimum distance
2 with $\mathcal{R}_3^2 = \{1, 2\}$; any other row selection would result in
a smaller minimum distance. Finally, the code of dimension
K = 3 requires selecting all rows, i.e. $\mathcal{R}_3^3 = \{0, 1, 2\}$,
resulting in a code of minimum distance 1. The use of
non-nested row sets, such as $\mathcal{R}_3^1 \not\subset \mathcal{R}_3^2$, allows for an improved
minimum-distance spectrum $S_{T_3} = (3, 2, 1)$ compared to [7] while
keeping the same polarization rate E = 0.42.
A similar analysis for $T_5$ leads to the minimum-distance
spectrum $S_{T_5} = (5, 3, 2, 1, 1)$ with optimal row sets
$\mathcal{R}_5^1 = \{0\}$, $\mathcal{R}_5^2 = \{3, 4\}$, $\mathcal{R}_5^3 = \{2, 3, 4\}$, $\mathcal{R}_5^4 = \{1, 2, 3, 4\}$
and $\mathcal{R}_5^5 = \{0, 1, 2, 3, 4\}$. On the other hand, its polarization
rate is E = 0.359, which is worse than the rate of 0.431
achieved by the optimal kernel in [7]. However, the optimal kernel
has spectrum (4, 2, 2, 2, 1), limiting the achievable minimum
distances of the full code to powers of 2; it is hard to state
whether the proposed $T_5$ spectrum is better, since the full code
distance depends on dimension and code length, but it permits a
finer quantization of the achievable minimum distances.
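Decoding equations in reduced form can be calculated for
the presented kernels; the ones for $T_2$ are the well known
$$f_0^2: \lambda_0 = L_0 \boxplus L_1,$$
$$f_1^2: \lambda_1 = (-1)^{\hat{u}_0} \cdot L_0 + L_1.$$
The ones for $T_3$ given in (10) are
$$f_0^3: \lambda_0 = L_0 \boxplus L_1 \boxplus L_2,$$
$$f_1^3: \lambda_1 = (-1)^{\hat{u}_0} \cdot L_0 + (L_1 \boxplus L_2),$$
$$f_2^3: \lambda_2 = (-1)^{\hat{u}_0} \cdot L_1 + (-1)^{\hat{u}_0 \oplus \hat{u}_1} \cdot L_2.$$
Finally, the decoding equations for $T_5$ given in (10) are
$$f_0^5: \lambda_0 = L_1 \boxplus L_2 \boxplus L_4,$$
$$f_1^5: \lambda_1 = (-1)^{\hat{u}_0} \cdot (L_0 \boxplus L_3 \boxplus (L_2 + (L_1 \boxplus L_4))),$$
$$f_2^5: \lambda_2 = (-1)^{\hat{u}_1} \cdot (L_0 \boxplus L_1) + (L_3 \boxplus L_4),$$
$$f_3^5: \lambda_3 = (-1)^{\hat{u}_0 \oplus \hat{u}_1 \oplus \hat{u}_2} \cdot L_0 + (-1)^{\hat{u}_0} \cdot L_1 + (L_2 \boxplus (L_3 + L_4)),$$
$$f_4^5: \lambda_4 = (-1)^{\hat{u}_0 \oplus \hat{u}_3} \cdot L_2 + (-1)^{\hat{u}_0 \oplus \hat{u}_2} \cdot L_3 + (-1)^{\hat{u}_0} \cdot L_4.$$
All expressions are optimal apart from $f_2^5$. Conjecturing the
original expression to be irreducible, we approximate it with a
reducible one. This partially affects the decoding performance;
however, it allows for analytical determination of reliabilities
to be used for code design as shown in Section IV-A.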
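As an illustration of how the reduced-form equations are used, the sketch below runs SC decoding on a single T3 box. It is a simplified example under stated assumptions: the ⊞ operation is replaced by its min-sum approximation, hard decisions map non-negative LLRs to bit 0, and all function names are hypothetical.

```python
T3 = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]

def boxplus_minsum(*llrs):
    """Min-sum approximation of the box-plus operation."""
    sign = 1.0
    for a in llrs:
        if a < 0:
            sign = -sign
    return sign * min(abs(a) for a in llrs)

def flip(bit):
    """Return (-1)^bit as a float."""
    return -1.0 if bit else 1.0

def hard(llr):
    """Hard decision with the convention LLR >= 0 -> bit 0."""
    return 0 if llr >= 0 else 1

def sc_decode_t3(L, frozen=()):
    """Sequentially apply f_0^3, f_1^3, f_2^3 to the input LLRs L.
    Positions listed in `frozen` are forced to 0."""
    lam0 = boxplus_minsum(L[0], L[1], L[2])
    u0 = 0 if 0 in frozen else hard(lam0)
    lam1 = flip(u0) * L[0] + boxplus_minsum(L[1], L[2])
    u1 = 0 if 1 in frozen else hard(lam1)
    lam2 = flip(u0) * L[1] + flip(u0 ^ u1) * L[2]
    u2 = 0 if 2 in frozen else hard(lam2)
    return [u0, u1, u2], [lam0, lam1, lam2]

def encode_t3(u):
    """x = u * T3 over F_2."""
    return [sum(u[r] * T3[r][t] for r in range(3)) % 2 for t in range(3)]
```

With noiseless LLRs (for instance +4 for a transmitted 0 and −4 for a transmitted 1) the decoder returns the encoded input vector for every choice of u, which is a quick consistency check of the three update equations against the kernel matrix.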
IV. CODE DESIGN
Multi-kernel polar codes introduce new options in the code
design, which for the original polar codes is limited to the
selection of the information set according to reliabilities. In
this section, we describe three design principles for
multi-kernel polar codes, called the reliability design, the distance
design and the hybrid design, theoretically motivating them
and providing for each one a practical design algorithm.
A. Reliability Design
The reliability design is based on the polarization phenomenon and aims at minimizing the probability of error
under SC decoding. This design is conceived for long codes,
where channel polarization is strong enough to discriminate
the channels properly. After a brief review of the concept, we
will show how to determine these reliabilities, concluding the
section with a discussion on the optimal kernel order.
We upper-bound the error probability under SC decoding of
an (N, K) multi-kernel polar code with information set $\mathcal{I}$ by
$$P_e^{SC} \le \sum_{i \in \mathcal{I}} P_e(u_i), \qquad (11)$$
where $P_e(u_i) = P(\hat{u}_i \ne u_i \mid \hat{u}_j = u_j, j < i)$ denotes the
probability of making a wrong decision for bit $\hat{u}_i$ assuming that
all previously decoded bits are correct. The reliability design
aims to minimize this upper bound, so that the information
set $\mathcal{I}^R$ is chosen to contain the K most reliable positions.
This is equivalent to finding the K positions minimizing the
maximum error probability within these positions; $\mathcal{I}^R$ can
hence be found as the solution of the optimization problem
$$\min \max_{i \in \mathcal{I}} P_e(u_i) \quad \text{s.t. } \mathcal{I} \subset [N],\ |\mathcal{I}| = K. \qquad (12)$$
The simplest way to calculate the reliabilities is to use
Monte-Carlo simulation, e.g. to run a genie-aided SC decoder
to estimate the error rate of each input bit. In more detail, the
all-zero codeword is transmitted over a channel with a target
design SNR, and is decoded with a modified SC decoder that
counts bit errors based on hard decisions of the LLRs but
feeds back the correct decisions. As this method requires a
large number of simulations to get stable results, we suggest
computing approximate reliabilities of the input bits by
density evolution under Gaussian approximation (DE/GA) [27]
instead. If we suppose LLR distributions to be Gaussian, their
variance is twice their mean value, i.e., $L_i \sim \mathcal{N}(m_i, 2m_i)$,
which makes it possible to follow their evolution by tracking their mean.
For DE/GA, the means are passed in the Tanner graph
from the right to the left, similarly to LLRs. Looking at the
Tanner graph block depicted in Fig. 2, we denote the mean
of $\lambda_i$ by $\mu_i$, and the mean of $L_j$ by $m_j$. For a BI-AWGN
transmission system, the initial channel LLRs are distributed as
$l_i \sim \mathcal{N}\left(\frac{2}{\sigma^2}, \frac{4}{\sigma^2}\right)$, the initial mean value being $m_i = \frac{2}{\sigma^2}$ [28].
Under the Gaussian assumption, the error probability $P_e(u_i)$
is in direct correspondence with the LLR mean value $\mu_i$ as
$P_e(u_i) = Q(\sqrt{\mu_i/2})$, where Q(.) denotes the tail probability
of the standard Gaussian distribution; hence the SC error
probability can be lower bounded by
$$P_e^{SC} \ge \max_{i \in \mathcal{I}} Q\left(\sqrt{\mu_i/2}\right). \qquad (13)$$
To reduce the complexity of (12), error probabilities may be
replaced by LLR mean values; $\mathcal{I}^R$ can then be determined by solving
$$\max \min_{i \in \mathcal{I}} \mu_i \quad \text{s.t. } \mathcal{I} \subset [N],\ |\mathcal{I}| = K, \qquad (14)$$
where $\mu_i$ denotes the mean value of the LLR of $u_i$.
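If kernel decoding equations (4) are expressed in reduced
form, then the equations tracking the LLR means can be
written directly: if LLRs $L_0, \ldots, L_{j-1}$ are independent, then
$$\mu(L_0 + \ldots + L_{j-1}) = m_0 + \ldots + m_{j-1}, \qquad (15)$$
$$\mu(L_0 \boxplus \ldots \boxplus L_{j-1}) = ϕ_j(m_0, \ldots, m_{j-1}), \qquad (16)$$
where
$$ϕ_j(m_0, \ldots, m_{j-1}) = φ^{-1}\left(1 - \prod_{t=0}^{j-1} (1 - φ(m_t))\right), \qquad (17)$$
$$φ(m) = 1 - \frac{1}{\sqrt{4\pi m}} \int_{-\infty}^{+\infty} \tanh\frac{u}{2}\, e^{-\frac{(u-m)^2}{4m}}\, du. \qquad (18)$$
We recall that functions $φ$ and $φ^{-1}$ can be approximated as
$$φ(m) \approx \begin{cases} e^{a m^2 - b m} & \text{if } 0 \le m < c \\ e^{-\alpha m^{\gamma} + \beta} & \text{if } m \ge c \end{cases} \qquad (19)$$
$$φ^{-1}(m) \approx \begin{cases} \frac{b - \sqrt{b^2 + 4a \ln m}}{2a} & \text{if } 0 \le m < c \\ \left(\frac{\beta - \ln m}{\alpha}\right)^{1/\gamma} & \text{if } m \ge c \end{cases} \qquad (20)$$
Parameters α = 0.4527, β = 0.0218, γ = 0.86, a = 0.0564,
b = 0.48560, c = 0.867861 are acquired by curve-fitting [29].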
The decoding equations for the kernels presented in Section
III-C lead to the following evolution of mean values.
DE/GA for kernel $T_2$:
$\mu_0 = ϕ_2(m_0, m_1)$
$\mu_1 = m_0 + m_1$
DE/GA for kernel $T_3$:
$\mu_0 = ϕ_3(m_0, m_1, m_2)$
$\mu_1 = m_0 + ϕ_2(m_1, m_2)$
$\mu_2 = m_1 + m_2$
DE/GA for kernel $T_5$:
$\mu_0 = ϕ_3(m_1, m_2, m_4)$
$\mu_1 = ϕ_3(m_0, m_3, (m_2 + ϕ_2(m_1, m_4)))$
$\mu_2 = ϕ_2(m_0, m_1) + ϕ_2(m_3, m_4)$
$\mu_3 = m_0 + m_1 + ϕ_2(m_2, m_3 + m_4)$
$\mu_4 = m_2 + m_3 + m_4$
The last point to be addressed is the selection of the order of
kernels. The kernel order has two aspects. First, transformation
matrices obtained by permuting the same kernels lead to
equivalent codes by conveniently selecting the two information
sets due to the permutation property of the Kronecker product.
However, it is hard to predict the impact of the kernel order
on the polarization of the input bits due to the non-linearity of
function φ. Therefore codes of same length and dimension but
different kernel orders may be different if reliability design is
performed; as a result, for specific code dimensions, one kernel
order may be preferable to others. We propose to perform an
exhaustive search among all possible kernel orders to find the
best one. For a code of dimension K, the metric used for the
order selection is the sum of the reliabilities of the best K
bits, and the kernel order giving the largest sum is retained.
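The DE/GA recursion, the selection of the K most reliable positions and the exhaustive search over kernel orders can be sketched together as follows. This is an illustrative sketch under several assumptions that are not stated in the text: only kernels T2 and T3 are supported; all channel LLRs share the same mean, so each stage receives identical input means; the leftmost Kronecker factor is processed first because it acts on the channel side; and the branch of the approximate inverse is selected by comparing its argument with φ(c). All function names (`phi`, `varphi`, `input_means`, ...) are hypothetical.

```python
import math
from itertools import permutations

# Curve-fitting parameters quoted in the text.
ALPHA, BETA, GAMMA = 0.4527, 0.0218, 0.86
A, B, C = 0.0564, 0.48560, 0.867861

def phi(m):
    """Approximation (19)."""
    if m <= 0.0:
        return 1.0
    if m < C:
        return math.exp(A * m * m - B * m)
    return math.exp(-ALPHA * m ** GAMMA + BETA)

def phi_inv(y):
    """Approximate inverse (20); branch chosen via phi(C) (assumption)."""
    y = min(max(y, 1e-300), 1.0)
    if y > phi(C):
        return (B - math.sqrt(B * B + 4.0 * A * math.log(y))) / (2.0 * A)
    return ((BETA - math.log(y)) / ALPHA) ** (1.0 / GAMMA)

def varphi(*means):
    """Mean of a box-plus of independent LLRs, equation (17)."""
    prod = 1.0
    for m in means:
        prod *= 1.0 - phi(m)
    return phi_inv(1.0 - prod)

# Mean evolution for the kernels T2 and T3 of the text.
KERNEL_MEANS = {
    2: lambda m: [varphi(m[0], m[1]), m[0] + m[1]],
    3: lambda m: [varphi(m[0], m[1], m[2]),
                  m[0] + varphi(m[1], m[2]),
                  m[1] + m[2]],
}

def input_means(kernel_sizes, channel_mean):
    """LLR means of u_0, ..., u_{N-1} for T_N = T_{p_1} (x) ... (x) T_{p_s}."""
    if not kernel_sizes:
        return [channel_mean]
    p = kernel_sizes[0]
    out = []
    for mu in KERNEL_MEANS[p]([channel_mean] * p):
        out.extend(input_means(kernel_sizes[1:], mu))
    return out

def error_probability(mu):
    """P_e = Q(sqrt(mu / 2)) under the Gaussian assumption."""
    return 0.5 * math.erfc(math.sqrt(mu) / 2.0)

def information_set(means, K):
    """Indices of the K largest means, i.e. a solution of (14)."""
    order = sorted(range(len(means)), key=lambda i: means[i], reverse=True)
    return sorted(order[:K])

def best_kernel_order(kernel_sizes, K, channel_mean):
    """Exhaustive search: keep the order with the largest sum of the
    K best reliabilities (here measured by the LLR means)."""
    def score(order):
        means = input_means(order, channel_mean)
        return sum(sorted(means, reverse=True)[:K])
    return max(sorted(set(permutations(kernel_sizes))), key=score)
```

For example, `input_means((2, 2, 3), 2.0 / sigma2)` returns the twelve means of the code in Fig. 1 for a given noise variance `sigma2`, and `best_kernel_order((2, 2, 3), K, 2.0 / sigma2)` compares the three distinct orders of these kernels. Using the LLR mean as the reliability metric in the order search is a choice made for this sketch.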
B. Distance Design
In this section, we describe a design for multi-kernel polar
codes maximizing the minimum distance of the resulting
code [17]. This design is envisaged for short codes, where
the polarization effect is not strong enough to prevail over
the minimum distance properties. For an (N, K) multi-kernel
polar code, the probability of error under maximum-likelihood
(ML) decoding for the AWGN channel is bounded below as
$$P_e^{ML} \ge Q\left(\sqrt{d\mu/2}\right), \qquad (21)$$
where d denotes the minimum distance of the code and
$\mu = 2/\sigma^2$ the mean of the channel LLR. If the information set $\mathcal{I}^D$
is selected to maximize $d\mu$, then (21) is minimized; since $\mu$
is fixed, this corresponds to solving the optimization problem
$$\max d_N(\mathcal{I}) \quad \text{s.t. } \mathcal{I} \subset [N],\ |\mathcal{I}| = K, \qquad (22)$$
where $d_N(\mathcal{I})$ denotes the minimum distance of the code
defined by the rows of $T_N$ indexed by $\mathcal{I}$. We solve this
problem in two steps: first, we calculate the optimal minimum
distance through the minimum distance spectrum of $T_N$, then
we find the information set achieving that distance.
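Algorithm 1 Information set for minimum distance
 1: Initialize the sets $\mathcal{I} = \emptyset$ and $\mathcal{R}_p^0 = \emptyset$
 2: Load vector $s_N = (2, 1)^{\otimes n} \otimes S_{T_p}$
 3: Load optimal row sets $\mathcal{R}_p^1, \ldots, \mathcal{R}_p^p$
 4: for k = 1 . . . K do
 5:     l = argmax($s_N$)
 6:     c = (l mod p)
 7:     $q = \lfloor (N - l - 1)/p \rfloor$
 8:     $\mathcal{I} = (\mathcal{I} \setminus (\mathcal{R}_p^c + qp)) \cup (\mathcal{R}_p^{c+1} + qp)$
 9:     $s_N(l) = 0$
10: end for
11: return $\mathcal{I}$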
Though finding the minimum-distance spectrum of a code
is in general a complex task, for polar codes it can be
easily computed as $S_{T_2^{\otimes n}} = \mathrm{sort}([2\ 1]^{\otimes n})$, where the vector
is sorted in descending order [30]. This property can be
generalized to multi-kernel polar codes as follows.
Proposition 1. If $T_N = T_2^{\otimes n} \otimes T_p$, then
$$S_{T_N} = \mathrm{sort}(S_{T_2^{\otimes n}} \otimes S_{T_p}) = \mathrm{sort}([2\ 1]^{\otimes n} \otimes S_{T_p}). \qquad (23)$$
Proof: The property obviously holds for n = 0. By
inductive hypothesis we now suppose that it holds for n − 1,
i.e., that $S_{T_{N/2}} = \mathrm{sort}(S_{T_2^{\otimes n-1}} \otimes S_{T_p})$. Defining $a^U = S_{T_{N/2}}$,
$a^L = 2S_{T_{N/2}}$, such that $a = \mathrm{sort}([a^U, a^L])$, the proposition is
proved if $a = S_{T_N}$ since
$$\begin{aligned}
\mathrm{sort}(S_{T_{N/2}}, 2S_{T_{N/2}}) &= \mathrm{sort}([2\ 1] \otimes S_{T_{N/2}}) \\
&= \mathrm{sort}([2\ 1] \otimes \mathrm{sort}([1, 2] \otimes S_{T_{N/4}})) \\
&= \mathrm{sort}([2\ 1] \otimes [1, 2] \otimes S_{T_{N/4}}) = \ldots \\
&= \mathrm{sort}([2\ 1]^{\otimes n} \otimes S_{T_p}).
\end{aligned}$$
Consider a subcode of
$T_N = \begin{pmatrix} T_{N/2} & 0 \\ T_{N/2} & T_{N/2} \end{pmatrix}$
defined by K rows, with $K^U$ rows from $T^U = \begin{pmatrix} T_{N/2} & 0 \end{pmatrix}$, $K^L$ rows
from $T^L = \begin{pmatrix} T_{N/2} & T_{N/2} \end{pmatrix}$, and $K^U + K^L = K$; denote the
corresponding submatrices of $T^U$ and $T^L$ by $T_A^U$
and $T_B^L$, respectively. If row indices are selected such that
$d(T_A^U) = a^U(K^U)$ and $d(T_B^L) = a^L(K^L)$, which is possible
by the induction hypotheses, then the minimum distance of
this subcode is
$$d\begin{pmatrix} T_A^U \\ T_B^L \end{pmatrix} \overset{(a)}{=} \min\left(d(T_A^U), d(T_B^L)\right) = \min\left(a^U(K^U), a^L(K^L)\right) \overset{(b)}{=} a(K)$$
where (a) follows from the distance property of the (u|u + v)
construction [31] and (b) from the sorting of two sorted lists.
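As a small worked instance of Proposition 1, take n = 1 and p = 3, i.e. $T_6 = T_2 \otimes T_3$, with the spectrum $S_{T_3} = (3, 2, 1)$ derived in Section III-C:

```latex
\begin{aligned}
[2\ 1] \otimes S_{T_3} &= [2\ 1] \otimes (3, 2, 1) = (6, 4, 2, 3, 2, 1), \\
S_{T_6} &= \mathrm{sort}\big((6, 4, 2, 3, 2, 1)\big) = (6, 4, 3, 2, 2, 1).
\end{aligned}
```

The unsorted vector on the first line is the vector $s_N$ loaded by Algorithm 1 for this code.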
Proposition 1 requires the transformation matrix of the code
to be in the form $T_N = T_2^{\otimes n} \otimes T_p$. Note that this is not a
limiting assumption for the minimum-distance design,
since the minimum-distance spectrum does not depend on the
order of the kernels in the Kronecker product. However, this
structure makes it possible to divide $T_N$ into $2^n$ sub-matrices of
p rows, termed sectors in the following, each one consisting of
a vector of $T_p$ kernels and all-zero matrices. Analogously,
information sets $\mathcal{I}$ of size K can be split into $2^n$ smaller
information sets $\mathcal{I}_q \subseteq [p]$, $|\mathcal{I}_q| = K_q \le p$ and $\sum_q K_q = K$,
where each $\mathcal{I}_q$ collects the rows of the q-th sector included
in $\mathcal{I}$. Since the minimum distance spectrum of the q-th sector is
given by $S_q = 2^{wt(q)} \cdot S_{T_p}$, where wt(q) is the number of
ones in the binary expansion of q, so that $2^{wt(q)}$ is the weight
of the q-th row of $T_2^{\otimes n}$, this division makes it possible to identify
the contribution of each sector to the minimum distance of
the code. Due to the distance property of the (u|u + v)
construction, $d_N(\mathcal{I}) = \min(d_p(\mathcal{I}_1), \ldots, d_p(\mathcal{I}_{2^n}))$; if $\mathcal{I}_q$ is
formed by the optimal row set $\mathcal{R}_p^{K_q}$, then $d_p(\mathcal{I}_q) = S_q(K_q)$.
This concept is exploited in the greedy Algorithm 1 to design
multi-kernel polar codes with optimal minimum distance. The
algorithm sequentially adds row indices to the information set
$\mathcal{I}$, modifying the information set of dimension K − 1 to obtain
the one for dimension K. The vector $s_N = [S_1 | \ldots | S_{2^n}]$ formed
by the minimum-distance spectra of the individual sectors is
initially calculated as $s_N = (2, 1)^{\otimes n} \otimes S_{T_p}$; note that the vector
is not sorted. At step k, the position l of the largest entry in
$s_N$ is extracted as $b = s_N(l)$, and $s_N(l)$ is set to zero. Then b
represents the best minimum distance achievable by the code
for dimension k; q defined in line 7 identifies the sector to be
updated to reach that distance; index c = l mod p represents
the number of rows of sector q already included in $\mathcal{I}$. To
increase the value of c by one, the algorithm substitutes the
previous optimal row set $\mathcal{R}_p^c$ with the following one, i.e. $\mathcal{R}_p^{c+1}$,
in line 8. The optimal row sets of the individual sectors need
to be shifted by qp to be properly included in $\mathcal{I}$. The algorithm
stops when $\mathcal{I}$ comprises K elements.
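The steps of Algorithm 1 translate almost line by line into code. The sketch below uses hypothetical names and assumes that the spectrum and the optimal row sets of the kernel $T_p$ are supplied by the caller; ties in the argmax are resolved by taking the first position, which matches the assumption that row sets within a sector are added in order.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors given as lists."""
    return [x * y for x in a for y in b]

def distance_information_set(n, spectrum, row_sets, K):
    """Greedy Algorithm 1 for T_N = T_2^{(x) n} (x) T_p.

    spectrum    -- minimum-distance spectrum S_{T_p} as a list of length p
    row_sets[k] -- optimal row set R_p^k as a set (row_sets[0] is empty)
    """
    p = len(spectrum)
    s = [1]
    for _ in range(n):
        s = kron_vec([2, 1], s)
    s = kron_vec(s, spectrum)          # line 2: s_N = (2,1)^{(x) n} (x) S_{T_p}
    N = len(s)
    info = set()                       # line 1
    for _ in range(K):                 # line 4
        l = max(range(N), key=lambda idx: s[idx])   # line 5, first maximum
        c = l % p                      # line 6: rows of the sector already used
        q = (N - l - 1) // p           # line 7: sector index
        info -= {r + q * p for r in row_sets[c]}        # line 8
        info |= {r + q * p for r in row_sets[c + 1]}
        s[l] = 0                       # line 9
    return sorted(info)

# Spectrum and optimal row sets of T3 from Section III-C.
S_T3 = [3, 2, 1]
R_T3 = [set(), {0}, {1, 2}, {0, 1, 2}]

# Information sets of the length-6 code T_6 = T_2 (x) T_3.
info_sets = [distance_information_set(1, S_T3, R_T3, K) for K in range(1, 5)]
```

For this length-6 code the information sets for K = 1 to 4 are {3}, {4, 5}, {0, 4, 5} and {0, 3, 4, 5}; note that they are not nested, since the step from K = 1 to K = 2 replaces row 3 by rows 4 and 5.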
Algorithm 1 requires the minimum-distance spectrum $S_{T_p}$
and the optimal row sets of kernel $T_p$. A brute force calculation
may be prohibitive for large kernels, since it is required to
check the distances generated by all the $\binom{p}{k}$ possible k-row
sub-matrices of $T_p$. However, if $T_p = T_{p_1} \otimes T_{p_2}$, the kernel $T_p$
can be divided into $p_1$ sectors of $p_2$ rows, each one formed
Algorithm 2 Spectrum of Kronecker product of kernels
 1: Load optimal row sets $\mathcal{R}_{p_1}^1, \ldots, \mathcal{R}_{p_1}^{p_1}, \mathcal{R}_{p_2}^1, \ldots, \mathcal{R}_{p_2}^{p_2}$
 2: for k = 1 . . . p do
 3:     $S_{T_p}(k) = 0$
 4:     κ = ListPartition(k, $p_1$, $p_2$)
 5:     for ℓ = 1 . . . length(κ) do
 6:         $\langle k_1, \ldots, k_t \rangle$ = κ(ℓ)
 7:         $\mathcal{R} = \bigcup_{j=1}^{t} \left(\mathcal{R}_{p_2}^{k_j} + \mathcal{R}_{p_1}^t(j) \cdot p_2\right)$
 8:         m = MinDist($T_p(\mathcal{R}, :)$)
 9:         if $m > S_{T_p}(k)$ then
10:             $S_{T_p}(k) = m$
11:             $\mathcal{R}_p^k = \mathcal{R}$
12:         end if
13:     end for
14: end for
15: return $\mathcal{R}_p^1, \ldots, \mathcal{R}_p^p$, $S_{T_p}$
optimal row sets associated to $\langle 1, 3 \rangle$ is given by (24) as
$$\mathcal{R}_{\langle 1,3 \rangle} = \left(\mathcal{R}_{p_2}^1 + 1 \cdot 3\right) \cup \left(\mathcal{R}_{p_2}^3 + 2 \cdot 3\right) = \{3, 6, 7, 8\} \qquad (25)$$
with minimum distance MinDist($T_9(\mathcal{R}_{\langle 1,3 \rangle}, :)$) = 2; similarly,
Algorithm 2 calculates $\mathcal{R}_{\langle 2,2 \rangle} = \{4, 5, 7, 8\}$ with minimum
distance 4 and $\mathcal{R}_{\langle 1,1,2 \rangle} = \{0, 3, 7, 8\}$ with minimum distance
3. Algorithm 2 selects $\mathcal{R}_9^4 = \mathcal{R}_{\langle 2,2 \rangle} = \{4, 5, 7, 8\}$, with
$S_{T_9}(4) = 4$. The complete minimum-distance spectrum of
$T_9$ obtained by running Algorithm 2 for k = 1, . . . , 9 is
$S_{T_9} = (9, 6, 4, 4, 3, 2, 2, 2, 1)$. Note that this spectrum is
optimal, as we verified by an exhaustive search. In general
the spectrum achieved by Algorithm 2 may be suboptimal.
C. Hybrid Design
by the juxtaposition of kernels Tp2 . Then the set of k row
indices, given by Rkp , is partitioned according to the sectors,
indexed by the set {i1 , ..., it }, and within
Pt each sector there
are kj rows, j = 1, . . . , t, where k = j=1 kj . Given this
structure, we propose to limit the search space for the sector
indices and for the index sets within the sectors to optimal
row sets of the component kernels.
In more detail, an optimal row set Rtp1 = {i1 , . . . , it }
identifies the indices of the sectors that will contribute to
k
Rkp ; for every retained sector ij , an optimal row set Rp2j is
k
included in Rp . For each k, all the possible combinations of
t and kj have to be checked. Algorithm 2 performs this task,
comparing the minimum distances of the row sets generated
by all integer partitions of k of maximum length p1 , i.e. the set
of all possible ways of writing k as a sum of up to p1 positive
integers notP
larger than p2 . The integer partition hk1 , . . . , kt i,
t
where k = i=1 ki , unambiguously identifies row set
Rhk1 ,...,kt i =
t
[
j=1
Rkp2j + ij · p2 ,
(24)
where Rtp1 = {i1 , . . . , it } with i1 < . . . < it . In practice, t
sectors are included in the row set, whose indices are listed in
Rtp1 . The j-th sector contributes with kj rows, that are chosen
according to the optimal row set of kernel Tp2 .
As an example, we show the steps performed by Algorithm 2 to compute the optimal row set R49 for kernel
1 1 1 1 1 1 1 1 1
1 0 1 1 0 1 1 0 1
0 1 1 0 1 1 0 1 1
1 1 1 0 0 0 1 1 1
T9 = T3 ⊗ T3 =
1 0 1 0 0 0 1 0 1 .
0 1 1 0 0 0 0 1 1
0 0 0 1 1 1 1 1 1
0 0 0 1 0 1 1 0 1
0 0 0 0 1 1 0 1 1
In this case, only 3 integer partitions of k = 4 respect the
required properties, namely h1, 3i, h2, 2i and h1, 1, 2i. The
The reliability design is conceived for SC decoding, and is
therefore suited for very long codes, where SC decoding
becomes asymptotically optimal. The distance design, on the
other hand, assumes ML decoding, and is thus suited
for short codes under SCL decoding, where moderate list
lengths approximate ML decoding very well. The hybrid
design combines reliability and distance as design criteria, and
it is particularly effective for constructing multi-kernel polar codes
of medium length under SCL decoding.
To introduce the hybrid design principle, we partition the
transformation matrix (1) of a multi-kernel polar code as

$$T_N = T_{N_r} \otimes T_{N_d}, \tag{26}$$

with $T_{N_r} = T_{p_1} \otimes \dots \otimes T_{p_\psi}$ and $T_{N_d} = T_{p_{\psi+1}} \otimes \dots \otimes T_{p_s}$. The
two matrices $T_{N_r}$ and $T_{N_d}$ can be treated as transformation
matrices of smaller multi-kernel polar codes, of length $N_r =
p_1 \cdots p_\psi$ and $N_d = p_{\psi+1} \cdots p_s$ respectively, with $N =
N_r \cdot N_d$; the corresponding Tanner graph is depicted in Fig. 3.
The idea is to apply the distance design to the left part of the
graph, consisting of $T_{N_d}$ blocks, and the reliability design to
the right part of the graph, consisting of $T_{N_r}$ blocks; indices
‘d’ and ‘r’ stand for distance and reliability, respectively. This
hybrid design comprises a parameter, $\psi$, that allows trading
distance against reliability. Reliability and distance designs can be
seen as extreme cases with $\psi = s$ and $\psi = 0$, respectively.
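The partition (26) can be checked numerically on a small case. The following is a minimal sketch, assuming the kernels $T_2$ and $T_3$ as they appear in the rows of $T_{12}$ in the construction example; the helper names `kron`, `transformation` and `hybrid_split` are hypothetical and chosen only for this illustration.

```python
from functools import reduce

# Kernels as read off the rows of T_12 in the construction example.
T2 = [[1, 0], [1, 1]]
T3 = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]


def kron(A, B):
    """Kronecker product of two binary matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def transformation(kernels):
    """Kronecker product of a (possibly empty) list of kernels."""
    return reduce(kron, kernels, [[1]])


def hybrid_split(kernels, psi):
    """Return (T_Nr, T_Nd) for the split after the first psi kernels."""
    if not 0 <= psi <= len(kernels):
        raise ValueError("psi must satisfy 0 <= psi <= s")
    return transformation(kernels[:psi]), transformation(kernels[psi:])


kernels = [T2, T2, T3]
T12 = transformation(kernels)
for psi in range(len(kernels) + 1):
    T_r, T_d = hybrid_split(kernels, psi)
    # Columns: psi, N_r, N_d, and whether T_Nr (x) T_Nd equals T_N.
    print(psi, len(T_r), len(T_d), kron(T_r, T_d) == T12)
```

For every admissible $\psi$ the product of the two partial matrices gives back $T_N$, which is simply the associativity of the Kronecker product; the empty product is represented here by the $1 \times 1$ matrix $(1)$.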
Consider now the following decoding principle. The normal
SC decoder proceeds until all right-messages are available at
the input to the first TNd block. The block makes a local
ML decision, i.e., it decides for the most likely codeword,
and this decision is fed back into the SC decoding process.
SC decoding proceeds until all right-messages are available at
the input to the second TNd block, which makes a local ML
decision and feeds the result back into the decoding process.
This continues until the last TNd block has made its decision.
Note that this decoding principle can be approximated by a
plain SCL decoder, where larger values of ψ may require larger
list sizes to reach ML decoding of left blocks.
Fig. 3: Tanner graph of $T_N = T_{N_r} \otimes T_{N_d}$ for the hybrid design.

Consider now the probability of error of the local ML
decision at the $i$-th $T_{N_d}$ block, denoted by $P_e(u_i)$ for any
$i = 0, 1, \dots, N_r - 1$, assuming all previous decisions to be
error-free. By SC decoding, all incoming messages have the
same reliability; imposing a Gaussian approximation on the
message densities, we denote the mean of the incoming
message density by $\mu_i$ and the minimum distance of the local
code, as imposed by the information set on the $T_{N_d}$ block, by
$d_i$. The probability of error can thus be lower-bounded by

$$P_e(u_i) \ge Q\left(\sqrt{d_i \mu_i / 2}\right), \tag{27}$$

while the error probability of the overall decoder is lower-bounded as

$$P_e^{\mathrm{MLSC}} \ge \max_{i \in [N_r]} P_e(u_i) \ge \max_{i \in [N_r]} Q\left(\sqrt{d_i \mu_i / 2}\right). \tag{28}$$

The information set $I^H$ of the hybrid design is selected to
minimize this lower bound, namely by solving the optimization
problem that maximizes the minimum of the terms $d_i \mu_i$:

$$\begin{aligned}
\max \quad & \min_{i \in [N_r]} d_i(I)\, \mu_i \\
\text{s.t.} \quad & I \subset [N],\ |I| = K,
\end{aligned} \tag{29}$$

where $d_i(I)$ denotes the minimum distance of the code
induced by $I$ over the $i$-th $T_{N_d}$ block. This hybrid design
minimizes the error rate for the mixed ML-SC decoder as
described above. Note that both the bound (28) and the
information set (29) are for a fixed value of $\psi \in [0, s]$, which
may be adapted to the available list length of the SCL decoder.

The optimisation for the information set, as given in (29),
can be solved by slightly modifying Algorithm 1, introduced
for the distance design in the previous section. Initially, the
reliabilities of the $N_r$ input bits of the partial transformation
matrix $T_{N_r}$ are determined using DE/GA, as described in
Section IV-A. These reliabilities are stored in an intermediate
vector $\mu = (\mu_{N_r - 1}, \dots, \mu_0)$, where $\mu_i$ represents the reliability
of the $i$-th input bit of the code generated by $T_{N_r}$.
At the same time, the minimum-distance spectrum of the
partial transformation matrix $T_{N_d}$ is computed, along with the
optimal row sets, using the methods for the distance design,
as described in Section IV-B. Finally, the vector $d_N = \mu \otimes S_{T_{N_d}}$,
representing the “hybrid” spectrum of $T_N$, is calculated and
given to Algorithm 1, along with the optimal row sets
calculated previously, to design the information set of the code.

D. Construction Example
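We illustrate the proposed designs through a multi-kernel
polar code of length $N = 12$ and dimension $K = 4$ with
transformation matrix $T_{12} = T_2^{\otimes 2} \otimes T_3$ depicted in Fig. 1, i.e.

$$T_{12} = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1
\end{pmatrix},$$

for a BI-AWGN channel with inputs $\pm 1$ and $\sigma^2 = 0.5$.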
1) Reliability design: Using DE/GA, the reliabilities are
calculated as (0.09, 1.28, 2, 1.85, 7.3, 9.12, 2.75, 9.57, 11.56,
11.94, 29.42, 32). The $K = 4$ most reliable positions form
the information set: $I^R = \{8, 9, 10, 11\}$.
2) Distance design: The minimum-distance spectrum and
the optimal row sets of the kernel $T_3$ depicted in (10) are
$S_{T_3} = (3, 2, 1)$ with $R_3^1 = \{0\}$, $R_3^2 = \{1, 2\}$ and $R_3^3 =
\{0, 1, 2\}$; the auxiliary vector is calculated as $d_{12} = (2, 1)^{\otimes 2} \otimes
(3, 2, 1) = (12, 8, 4, 6, 4, 2, 6, 4, 2, 3, 2, 1)$. Algorithm 1 for
$K = 4$ then gives $I^D = \{3, 6, 10, 11\}$.
3) Hybrid design: Transformation matrix $T_{12}$ includes $s =
3$ kernels, thus $\psi \in \{0, 1, 2, 3\}$. The hybrid design results in
the reliability design for $\psi = 3$ and in the distance design
for $\psi = 0$. For $\psi = 2$, we have $T_{N_r} = T_2^{\otimes 2}$ and $T_{N_d} = T_3$.
The reliabilities of $T_{N_r}$ are determined by DE/GA, resulting in
$\mu = (16, 5.78, 4.56, 1)$, while $S_{T_3}$ has been determined above.
Thus we obtain the mixed spectrum $s_{12} = \mu \otimes S_{T_3} = (48,
32, 16, 17.34, 11.56, 5.78, 13.68, 9.12, 4.56, 3, 2, 1)$ for the
transformation matrix $T_{12}$. This vector, along with the optimal
row sets for $T_3$, is used as input to Algorithm 1, giving the
information set $I^H = \{6, 9, 10, 11\}$ for $K = 4$. For $\psi = 1$
instead, we have $T_{N_r} = T_2$ and $T_{N_d} = T_2 \otimes T_3$, with vector
$\mu = (8, 2.28)$. The minimum-distance spectrum of $T_{N_d}$ can be
calculated by Algorithm 2 as $S_{T_2 \otimes T_3} = (6, 4, 3, 2, 2, 1)$, with
$R_6^1 = \{3\}$, $R_6^2 = \{4, 5\}$, $R_6^3 = \{0, 4, 5\}$, $R_6^4 = \{0, 3, 4, 5\}$,
$R_6^5 = \{1, 2, 3, 4, 5\}$ and $R_6^6 = \{0, 1, 2, 3, 4, 5\}$. Algorithm 1
then takes as input $d_{12} = \mu \otimes S_{T_2 \otimes T_3} = (48, 32, 24, 16,
16, 8, 13.68, 9.12, 6.84, 4.56, 4.56, 2.28)$ and provides the
information set $I^H = \{6, 9, 10, 11\}$.
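The spectrum $S_{T_2 \otimes T_3}$ and the row sets quoted above can be reproduced with a small Python sketch of Algorithm 2. It is only an illustration under stated assumptions: MinDist is implemented by brute force, ListPartition is taken to enumerate non-decreasing partitions, the optimal row sets of $T_2$ ($R_2^1 = \{1\}$, $R_2^2 = \{0, 1\}$) are inferred from its spectrum $(2, 1)$, and all function names are hypothetical.

```python
from itertools import product

T2 = [[1, 0], [1, 1]]
T3 = [[1, 1, 1], [1, 0, 1], [0, 1, 1]]
ROWS_T2 = [set(), {1}, {0, 1}]            # assumed optimal row sets of T2
ROWS_T3 = [set(), {0}, {1, 2}, {0, 1, 2}]  # optimal row sets of T3 from the text


def kron(A, B):
    """Kronecker product of two binary matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def min_dist(rows):
    """Smallest Hamming weight over all non-zero GF(2) combinations of rows."""
    n, best = len(rows[0]), len(rows[0])
    for coeffs in product((0, 1), repeat=len(rows)):
        if any(coeffs):
            word = [sum(c * r[i] for c, r in zip(coeffs, rows)) % 2 for i in range(n)]
            best = min(best, sum(word))
    return best


def partitions(k, max_parts, max_part, smallest=1):
    """Non-decreasing partitions of k: at most max_parts parts, each <= max_part."""
    if k == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(smallest, min(k, max_part) + 1):
        for rest in partitions(k - first, max_parts - 1, max_part, first):
            yield (first,) + rest


def kronecker_spectrum(Ta, rows_a, Tb, rows_b):
    """Sketch of Algorithm 2 for T_p = Ta (x) Tb; returns (spectrum, row sets)."""
    p1, p2 = len(Ta), len(Tb)
    T = kron(Ta, Tb)
    spectrum, row_sets = [], [set()]
    for k in range(1, p1 * p2 + 1):
        best_d, best_rows = 0, None
        for part in partitions(k, p1, p2):
            sectors = sorted(rows_a[len(part)])  # i_1 < ... < i_t
            R = set()
            for k_j, i_j in zip(part, sectors):
                R |= {r + i_j * p2 for r in rows_b[k_j]}  # equation (24)
            m = min_dist([T[r] for r in sorted(R)])
            if m > best_d:
                best_d, best_rows = m, R
        spectrum.append(best_d)
        row_sets.append(best_rows)
    return spectrum, row_sets


spec6, rows6 = kronecker_spectrum(T2, ROWS_T2, T3, ROWS_T3)
print(spec6, [sorted(r) for r in rows6[1:]])
```

Calling `kronecker_spectrum(T3, ROWS_T3, T3, ROWS_T3)` in the same way gives the spectrum of $T_9$ discussed in Section IV-B. The brute-force `min_dist` is exponential in the number of rows and is only meant for kernels of this size.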
The four designs are summarized in Table I. As expected,
distance design leads to the best minimum distance of 6, while
the other designs yield a minimum distance of 4. In this case,
ψ = 1, 2 result in the same information set, which is however
different from the reliability design.
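The minimum distances listed in Table I can be checked by exhaustive enumeration, since each code has only $2^4 - 1$ non-zero codewords. The sketch below is an illustration, not part of the original design flow: it encodes the rows of $T_{12}$ given above as bit strings and uses a hypothetical helper `min_distance`.

```python
from functools import reduce
from itertools import combinations

# Rows of T_12 = T_2 (x) T_2 (x) T_3, copied from the matrix above.
T12_ROWS = [
    "111000000000", "101000000000", "011000000000",
    "111111000000", "101101000000", "011011000000",
    "111000111000", "101000101000", "011000011000",
    "111111111111", "101101101101", "011011011011",
]


def min_distance(info_set, rows=T12_ROWS):
    """Minimum Hamming weight of the code spanned by the selected rows."""
    gens = [int(rows[i], 2) for i in sorted(info_set)]
    best = len(rows[0])
    for r in range(1, len(gens) + 1):
        for combo in combinations(gens, r):
            word = reduce(lambda a, b: a ^ b, combo)  # GF(2) sum of rows
            best = min(best, bin(word).count("1"))
    return best


for name, info in [("reliability", {8, 9, 10, 11}),
                   ("hybrid", {6, 9, 10, 11}),
                   ("distance", {3, 6, 10, 11})]:
    print(name, sorted(info), min_distance(info))
```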
ψ           I                  min. dist.
3 (rel.)    {8, 9, 10, 11}     4
2           {6, 9, 10, 11}     4
1           {6, 9, 10, 11}     4
0 (dist.)   {3, 6, 10, 11}     6
TABLE I: Information sets I and minimum distances for (12, 4) codes.

V. NUMERICAL EXAMPLES
In this section we evaluate the performance of the proposed
multi-kernel polar codes under the different designs. All the
simulations determine the block error rate (BLER) of the
codes for BI-AWGN channels under SCL decoding, usually
for list size L = 8. To begin with, we evaluate the impact
of the kernel order in the multi-kernel construction. Next,
the impact of parameter ψ of the hybrid design is studied.
Then we compare multi-kernel polar codes to punctured and
shortened polar codes of the same length and dimension. Finally,
we compare multi-kernel polar codes with standard codes of
the same length and dimension, namely with LDPC codes for
802.11n [32] and polar codes for 5G NR [4].
Fig. 4: BLER performance of (192, 96) multi-kernel polar
codes under reliability design for different positions of the
unique T3 kernel and list size L = 8. (Axes: BLER versus Eb/N0 in dB; one curve for each position p1 = 3, ..., p7 = 3 of the T3 kernel.)

Figure 4 shows the BLER performance of multi-kernel polar
codes of length $N = 192 = 2^6 \cdot 3$ and rate $R = 1/2$,
designed according to reliability, under SCL decoding with
list size $L = 8$. The transformation matrix of this code is
constructed by mixing 6 kernels $T_2$ and a single $T_3$ kernel. There
are hence 7 possible configurations of (1), depending
on the position of the $T_3$ kernel in the Kronecker product.
According to Figure 4, the performance gap between the
best and the worst design for this code is about 0.5 dB. In
particular, the best performance is obtained when $p_2 = 3$,
i.e. when $T_3$ is placed in second position, while the worst
performance is attained by switching the first two kernels of
the best design, with $p_1 = 3$. Note also that the slopes of
the curves differ, which is due to differences in their
distance properties. The metric proposed in Section IV-A for
the selection of the kernel order suggests setting $p_3 = 3$,
resulting in average performance at low SNR but quickly
approaching the performance of $p_2 = 3$ at higher SNR.
Simulations confirm that the kernel order has some impact on
the performance of multi-kernel polar codes under reliability
design; no systematic behavior could be identified at this stage.

Fig. 5: BLER performance of (384, 192) multi-kernel polar
codes under hybrid design for different values of ψ and L = 8. (Axes: BLER versus Eb/N0 in dB; one curve for each ψ = 0 (dist), 1, ..., 8 (rel).)

Figure 5 shows the BLER performance of multi-kernel polar
codes of length $N = 384 = 2^7 \cdot 3$ and rate $R = 1/2$ under
hybrid design for different values of parameter $\psi$ introduced
in Section IV-C. All the simulations are performed under SCL
decoding with list size $L = 8$. In this case, the number of
kernels composing the transformation matrix is $s = 8$, and
the single $T_3$ kernel is placed in the last position, i.e.
$p_8 = 3$. Parameter $\psi$ can hence span from 0 to 8, where the
extreme cases $\psi = 0$ and $\psi = 8$, corresponding to distance and
reliability designs, respectively, are highlighted with different
colors. As expected for mid-length codes, distance design
performs worse than reliability design; however, the code is not
long enough to exhibit strong polarization. In this case, the mixed
design offers an advantage over the other two designs due
to higher flexibility. Simulations show that the performance
strongly depends on the choice of parameter $\psi$; a clear pattern
is not recognizable and needs further study. In the following,
we set $\psi = \lceil \frac{s-1}{2} \rceil$ as a rule of thumb for the hybrid design.
Fig. 6: BLER performance of multi-kernel polar codes compared with punctured [10] and shortened [11] polar codes of rate
R = 1/2 under SCL decoding with L = 8. (Panels: (a) N = 144, (b) N = 200, (c) N = 90; axes: BLER versus Eb/N0 in dB; curves: MK - Dist, MK - Hyb, MK - Rel, PC - short, PC - punct.)

Figure 6 shows a comparison between the presented multi-kernel polar codes under various designs and rate-matched
polar codes for rate $R = 1/2$ under SCL decoding with $L = 8$.
Polar codes of length $N$ are generated from a mother polar
code of length $M = 2^{\lceil \log_2 N \rceil}$ through puncturing according
to [10] and shortening according to [11]. The information
sets of the mother polar codes are hence calculated by either
puncturing the first $M - N$ bits or shortening the last $M - N$
bits and calculating bit reliabilities through DE/GA [13]. Code
lengths are selected to cover a wide range of possibilities.
In Figure 6a the code length is set to $N = 144 = 3^2 \cdot 2^4$.
This length can be reached by multi-kernel polar codes with
two $T_3$ kernels and four $T_2$ kernels, while it demands strong
puncturing/shortening by $M - N = 112$ bits to apply rate-matching from
a mother polar code of length $M = 256$. Figure 6b shows the
performance of codes of length $N = 200 = 5^2 \cdot 2^3$, reached by
multi-kernel polar codes using two $T_5$ kernels in conjunction with
$T_2$ kernels. Finally, Figure 6c mixes all the presented kernels,
showing the performance of codes of length $N = 90 = 5 \cdot 3^2 \cdot 2$.
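The arithmetic behind these three cases can be reproduced with a few lines of Python. This is a small sketch with hypothetical helper names; the set of available kernel sizes $(2, 3, 5)$ is an assumption matching the kernels $T_2$, $T_3$ and $T_5$ used in this section.

```python
def mother_length(N):
    """Smallest power of two M >= N, i.e. M = 2^ceil(log2 N)."""
    return 1 << (N - 1).bit_length()


def kernel_factorization(N, kernel_sizes=(2, 3, 5)):
    """Return {p: multiplicity} if N factors over the kernel sizes, else None."""
    counts = {}
    for p in kernel_sizes:
        while N % p == 0:
            N //= p
            counts[p] = counts.get(p, 0) + 1
    return counts if N == 1 else None


for N in (144, 200, 90):
    M = mother_length(N)
    # Code length, mother code length, punctured/shortened bits, kernels used.
    print(N, M, M - N, kernel_factorization(N))
```

A length such as $N = 14$ has no factorization over these kernel sizes, in which case the helper returns `None` and a larger kernel or rate matching would be needed.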
Overall, we can see that the reliability design of multi-kernel
polar codes behaves similarly to the best rate-matching strategy among puncturing and shortening. The sub-optimality of
the T5 LLR equation f25 has an impact on the performance of
the reliability design for short codes, as shown in Figure 6b.
However, the other designs are not excessively affected by this
issue, exhibiting good performance in this case. Moreover, the distance design outperforms the other designs for short codes, where
minimum distance still has more impact than the polarization
effect; the hybrid design permits a tradeoff between polarization
and distance, always outperforming rate-matched polar codes.
Figure 7 shows a comparison among multi-kernel polar
codes, LDPC codes of the 802.11n standard [32] and polar
codes standardized in 5G [4]. The 802.11n standard specifies
the three code lengths 1944, 1296 and 648, and the four code
rates 1/2, 2/3, 3/4 and 5/6; we show simulation results for
N = 1944 and all four admissible rates. Multi-kernel polar
codes are designed according to reliability, with the addition
of 10 CRC bits to help SCL decoding [2]. List size is set to
L = 8 for both multi-kernel polar codes and 5G polar codes;
LDPC codes are decoded using a 10-iteration offset min-sum
decoder. Results show that the proposed multi-kernel polar
codes are comparable to state-of-the-art channel codes.

Fig. 7: BLER performance comparison among multi-kernel
polar codes with 8 CRC bits, LDPC codes in [32] and 5G
polar codes [4] for length N = 1944. (Axes: BLER versus Eb/N0 in dB; curves: LDPC, MK-CRC and 5G polar at the simulated rates.)

VI. CONCLUSIONS
In this paper, we proposed a generalized polar code construction based on multiple kernels, termed multi-kernel polar
codes. Though encoding and decoding resemble those of the
original polar codes, as proposed by Arikan, multi-kernel polar
codes provide various new design options. We presented
new code design principles based on reliability, distance, and
a mix of the two, termed hybrid design,
which allows the design to be adapted to a given list length of the
SCL decoder. The error-rate performance of multi-kernel polar
codes was evaluated by simulations and found to be superior
to state-of-the-art polar-code constructions using puncturing
or shortening methods, and state-of-the-art LDPC codes.
This paper focused on the information set design of multi-kernel polar codes. The design of optimal kernels and the
optimization of the hybrid design are not addressed in this paper
and are left for future research; the presented tools for analysis
and design are believed to be useful for these purposes.

REFERENCES
[1] E. Arikan, “Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,”
IEEE Transactions on Information Theory, vol. 55, no. 7, pp. 3051–
3073, July 2009.
[2] I. Tal and A. Vardy, “List decoding of polar codes,” in IEEE
International Symposium on Information Theory (ISIT), St. Petersburg,
Russia, July 2011.
[3] K. Niu and K. Chen, “CRC-aided decoding of polar codes,” IEEE
Communications Letters, vol. 16, no. 10, pp. 1668–1671, 2012.
[4] V. Bioglio, C. Condo, and I. Land, “Design of polar codes in 5G New
Radio,” arXiv preprint arXiv:1804.04389, April 2018.
[5] S. B. Korada, E. Sasoglu, and R. Urbanke, “Polar codes: Characterization of exponent, bounds, and constructions,” IEEE Transactions on
Information Theory, vol. 56, no. 12, pp. 6253–6264, Dec. 2010.
[6] N. Presman, O. Shapira, S. Litsyn, T. Etzion, and A. Vardy, “Binary
polarization kernels from code decompositions,” IEEE Transactions on
Information Theory, vol. 61, no. 5, pp. 2227–2239, May 2015.
[7] H.-P. Lin, S. Lin, and K. Abdel-Ghaffar, “Linear and nonlinear binary
kernels of polar codes of small dimensions with maximum exponents,”
IEEE Transactions on Information Theory, vol. 61, no. 10, pp. 5253–
5270, Oct. 2015.
[8] R. Mori and T. Tanaka, “Non-binary polar codes using Reed-Solomon
codes and algebraic geometry codes,” in IEEE Information Theory
Workshop (ITW), Dublin, Ireland, September 2010.
[9] N. Presman, O. Shapira, and S. Litsyn, “Mixed-kernels constructions of
polar codes,” IEEE Journal on Selected Areas in Communications, vol.
34, no. 2, pp. 239–253, 2016.
[10] K. Niu, K. Chen, and J.-R. Lin, “Beyond turbo codes: Rate-compatible
punctured polar codes,” in IEEE International Conference on Communications (ICC), Budapest, Hungary, June 2013.
[11] R. Wang and R. Liu, “A novel puncturing scheme for polar codes,”
IEEE Communications Letters, vol. 18, no. 12, pp. 2081–2084, Dec.
2014.
[12] L. Zhang, Z. Zhang, X. Wang, Q. Yu, and Y. Chen, “On the puncturing
patterns for punctured polar codes,” in IEEE International Symposium
on Information Theory (ISIT), Hawaii, U.S.A., July 2014.
[13] V. Bioglio, F. Gabry, and I. Land, “Low-complexity puncturing and
shortening of polar codes,” in IEEE Wireless Communications and
Networking Conference (WCNC), San Francisco, USA, March 2017.
[14] M. K. Lee and K. Yang, “The exponent of a polarizing matrix constructed from the Kronecker product,” Designs, Codes and Cryptography,
vol. 70, no. 3, pp. 313–322, March 2014.
[15] F. Gabry, V. Bioglio, I. Land, and J.-C. Belfiore, “Multi-kernel
construction of polar codes,” in IEEE International Conference on
Communications (ICC), Paris, France, May 2017.
[16] M. Benammar, V. Bioglio, F. Gabry, and I. Land, “Multi-kernel polar
codes: Proof of polarization and error exponents,” in IEEE Information
Theory Workshop (ITW), Kaohsiung, Taiwan, Nov. 2017.
[17] V. Bioglio, F. Gabry, I. Land, and J.-C. Belfiore, “Minimum-distance
based construction of multi-kernel polar codes,” in IEEE Global
Communications Conference (GLOBECOM), Singapore, Dec. 2017.
[18] M. Mondelli, S. H. Hassani, and R. L. Urbanke, “From polar to Reed-Muller codes: a technique to improve the finite-length performance,”
IEEE Transactions on Communications, vol. 62, no. 9, pp. 3084–3091,
2014.
[19] R. Mori and T. Tanaka, “Channel polarization on q-ary discrete
memoryless channels by arbitrary kernels,” in IEEE International
Symposium on Information Theory (ISIT), Austin, Texas, USA, June
2010.
[20] D. S. Bernstein, Matrix mathematics: Theory, facts, and formulas with
application to linear systems theory, Princeton University Press, 2005.
[21] A. Alamdar-Yazdi and F. R. Kschischang, “A simplified successive-cancellation decoder for polar codes,” IEEE Communications Letters,
vol. 15, no. 12, pp. 1378–1380, 2011.
[22] K. Niu and K. Chen, “Stack decoding of polar codes,” Electronics
letters, vol. 48, no. 12, pp. 695–697, 2012.
[23] G. Bonik, S. Goreinov, and N. Zamarashkin, “Construction and analysis
of polar and concatenated polar codes: practical approach,” in arXiv
preprint, arXiv:1207.4343, July 2012.
[24] Z. Huang, S. Zhang, F. Zhang, C. Duanmu, and M. Chen, “On the
successive cancellation decoding of polar codes with arbitrary binary
linear kernels,” in arXiv preprint, arXiv:1701.03264, Jan. 2017.
[25] H. Griesser and V. R. Sidorenko, “A posteriori probability decoding
of nonsystematically encoded block codes,” Problems of Information
Transmission, vol. 38, no. 3, pp. 182–193, Mar. 2002.
[26] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block
and convolutional codes,” IEEE Transactions on Information Theory,
vol. 42, no. 2, pp. 429–445, 1996.
[27] R. Mori and T. Tanaka, “Performance of polar codes with the construction using density evolution,” IEEE Communications Letters, vol. 13,
no. 7, pp. 519–521, July 2009.
[28] S. Y. Chung, T. J. Richardson, and R. L. Urbanke, “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian
approximation,” IEEE Transactions on Information Theory, vol. 47, no.
2, pp. 657–670, 2001.
[29] J. Ha, J. Kim, and S. W. McLaughlin, “Rate-compatible puncturing
of low-density parity-check codes,” IEEE Transactions on Information
Theory, vol. 50, no. 11, pp. 2824–2836, 2004.
[30] N. Hussami, S. B. Korada, and R. Urbanke, “Performance of polar codes
for channel and source coding,” in IEEE International Symposium on
Information Theory (ISIT), Seoul, Korea, June 2009.
[31] F. J. MacWilliams and N. J. A. Sloane, The theory of error-correcting
codes, vol. 16, Elsevier, 1977.
[32] “IEEE standard for information technology - local and metropolitan area
networks - specific requirements - part 11: Wireless LAN medium access
control (MAC) and physical layer (PHY) specifications,” Mar 2012.