
Improving Inter Prediction in HEVC with Residual DPCM for Lossless Screen Content Coding



Matteo Naccari¹, Saverio G. Blasi², Marta Mrak¹, Ebroul Izquierdo²

¹British Broadcasting Corporation R&D, 56 Wood Lane, W12 7SB, London UK
²Queen Mary, University of London, Mile End Road, E1 4NS, London UK
¹{matteo.naccari, marta.mrak}@bbc.co.uk, ²{sgblasi, ebroul.izquierdo}@eecs.qmul.ac.uk

Abstract: Video content containing computer generated objects
is usually denoted as screen content and is becoming popular in
applications such as desktop sharing, wireless displays, etc.
Screen content images and videos are characterized by high
frequency details such as sharp edges and high contrast image
areas. On these areas classical lossy encoding tools (spatial
transform plus quantization) may significantly compromise
their quality and intelligibility. Therefore, lossless coding is used
instead and improved coding tools should be specifically devised
for screen content. In this context this paper proposes a residual
differential pulse code modulation (RDPCM) applied to inter
predicted residuals and tested in the context of the HEVC range
extension development. The proposed method exploits the spatial
correlation present in blocks containing edges or text areas which
are poorly predicted by motion compensation. In addition to the
baseline inter RDPCM, two improvements to the compression
efficiency and the overall throughput are presented and assessed.
When compared to HEVC lossless coding as specified in Version
1 of the standard, the proposed algorithm achieves up to 8%
average bitrate reduction while not increasing the overall
decoding complexity.
I. INTRODUCTION
Screen content refers to images and videos which contain
computer generated objects or screen shots from computer
applications. This kind of content requires efficient
compression solutions as its use is becoming more popular in
emerging technologies such as desktop sharing, video walls in
control rooms, wireless display and digital remote operating
rooms for surgeries [1], [2]. Screen content differs significantly
from the camera captured content due to the presence of high
frequency features such as sharp edges and high contrast areas.
The presence of these features reduces the coding efficiency of
classical hybrid block-based image and video codecs which use
spatial transforms to compact the energy of signals into a few
lower frequency coefficients. Moreover, the quantization
applied by the aforementioned lossy codecs may severely blur
details in text areas compromising the intelligibility of the
whole content. For these reasons lossless coding techniques
may be preferable for screen content applications and novel
coding tools should be devised to improve the compression
efficiency of image and video coding standards.
This need has gained further evidence given the current
standardization activities inside the joint collaborative team on
video coding (JCT-VC) which is a joint partnership between
ISO/IEC MPEG and ITU-T VCEG. The JCT-VC defined the
high efficiency video coding (HEVC) standard [3], which
achieves about 50% bitrate reduction at the same subjective
quality as its predecessor, the H.264/AVC standard [4].
Version 1 of HEVC was finalized in January 2013 [5]. Since
then the JCT-VC has concentrated efforts on the development
of scalable, 3D and range extensions of HEVC. More
specifically, the HEVC range extension (HEVC-RExt)
addresses compression of content represented with more than
10 bits per sample, chroma sampling other than the 4:2:0
supported in Version 1, support for alpha channel and
improved lossless coding tools for screen content. On this latter
aspect, it is worth mentioning that HEVC Version 1 already
supports the lossless coding mode by simply signaling the so-
called transform-bypass flag for each coding unit (CU). When
this flag is set to 1, the spatial transform, quantization and in-
loop filter processes are skipped and only intra or inter
prediction is performed followed by entropy encoding. While
using the transform-bypass flag allows lossless coding without
introducing any additional modules, the associated
compression efficiency may not be satisfactory. It is well
known that the pixels inside each coded block can be used to
improve prediction, exploiting the spatial redundancy [6],
especially in applications that require a high quality of decoded
content. Therefore such approaches are suitable for high-
fidelity coding.
In this context this paper proposes an inter Residual
Differential Pulse Code Modulation (inter RDPCM) applied to
motion compensated residuals in lossless screen content coding
(SCC) scenarios. The novelty brought by this paper is twofold:
first, the proposed inter RDPCM is applied to the HEVC
standard at three different levels of granularity, namely the
coding unit (CU), prediction unit (PU) or transform unit (TU)
level. In particular, three DPCM prediction modes (vertical,
horizontal or no DPCM) are considered independently. Second,
two additional tools are proposed for inter RDPCM: Prediction
Chunking (PC) and Hierarchical Prediction (HP). PC can be
used to improve the overall throughput, thus decreasing
complexity. On the other hand, HP can be used to improve the
compression efficiency of the proposed inter RDPCM method.
The remainder of this paper is organized as follows. Section
II briefly reviews the background work related to lossless SCC.
Section III presents the proposed inter RDPCM together with
the HP and PC coding tools. Section IV reports the
experimental results to assess the inter RDPCM performance
while Section V concludes the paper.
978-1-4799-0294-1/13/$31.00 ©2013 IEEE, PCS 2013
II. RELATED WORK
For applications where high compression ratios are sought
it is common to use lossy coding based on quantization which
is often coupled with a transform. In applications that require
lossless coding the introduction of distortion into the signal is
prevented by skipping the transform/quantization and in-loop
filtering stages. In HEVC, this leads to direct entropy coding of
the residual samples. Skipping the transform stage is typically a
suboptimal solution for compression. Therefore, it is crucial
that the prediction step is further optimized to achieve efficient
coding performance. Several techniques have been proposed to
improve the prediction accuracy for lossless coding.
Differential pulse code modulation has been previously
used to better exploit spatial redundancy in the video content.
Sample-by-sample RDPCM of intra-predicted residuals was
proposed [7] in the context of H.264/AVC lossless coding.
When using this technique, instead of performing conventional
intra-prediction each residual sample is predicted from
neighboring residuals in the vertical or horizontal direction
when the intra prediction is equal to one of these two
directions. Average bitrate reductions of 12% were reported
using this technique compared with conventional H.264/AVC
lossless coding. This technique was later extended and adapted
to the HEVC standard [8] achieving on average 8.4% bitrate
reductions on screen content sequences. A similar technique,
also based on vertical and horizontal RDPCM, was applied at
the macroblock level to inter-predicted residuals in the context
of H.264/AVC lossless coding [9], obtaining
average bitrate reductions of around 3.5%.
Other methods have been proposed to increase the
efficiency of lossless video coding. Recently, sample based
angular intra-prediction (SAP) [10] was introduced for HEVC
lossless coding. When using this technique each sample inside
a PU is predicted using samples in the same PU. The prediction
samples can be extracted from the PU at different angles not
limited to the horizontal or vertical prediction directions. By
performing intra-prediction on a sample-by-sample basis
within a PU, spatial redundancy can be better exploited which
results in up to 11.8% bitrate reduction compared to
conventional HEVC lossless coding. It has been shown that a
simplified version of SAP [11] is conceptually identical to
vertical and horizontal intra-prediction RDPCM while still
providing a bitrate reduction of up to 10.5% for screen content.
III. RESIDUAL DPCM IN HEVC INTER-PREDICTION
This section presents inter RDPCM and its application to
the codec considered for the HEVC-RExt. First, some general
considerations are discussed, followed by the proposed inter
RDPCM together with the hierarchical prediction and prediction
chunking tools.
A. General considerations and the HEVC coding structure
The HEVC standard performs inter-prediction by means of
block-based motion compensation which assumes that all the
pixels inside a block move approximately with the same
motion. This assumption leads to poor prediction performance
along sharp edges. In screen content it is reasonable to expect
that inter-prediction residuals still present some correlation
along image edges, which can be exploited by performing a
spatial DPCM along the edge direction. This intuition is the
basis for the proposed inter RDPCM. Several directions may be
considered; however, to limit the computational complexity,
only horizontal and vertical ones are included since they are
predominant in screen content. As stated in the Introduction,
this paper investigates inter RDPCM application at different
levels of granularity corresponding to the HEVC coding
structure, briefly reviewed in the following. The HEVC
standard makes use of a flexible partitioning where image areas
are recursively split according to a quad-tree fashion [3]. More
precisely, the encoder divides each frame into a grid of non-
overlapping coding tree units (CTUs) which identify square
regions of N×N luma samples. Then each CTU can be split
using a quad-tree partitioning. Each level of this partitioning
leads to a set of CUs where the coding process takes place.
Each CU can be further split for inter-prediction into up to four
PUs, according to a set of possible coding modes. When the
prediction process is finished, each CU is split again for
entropy coding following another quad-tree partitioning. This
partitioning leads to a set of TUs where transformation,
quantization and entropy coding are carried out.
B. General method for inter RDPCM
Let r(i, j) be the elements of an M×N residual block of
inter-predicted luma or chroma samples where M and N are the
block height and width respectively. The vertical inter RDPCM
mode is defined as follows. The samples in the first row in the
block are left unchanged. All other samples are predicted from
the sample immediately above in the same column. Formally
the modified residuals resulting from this RDPCM mode are:


\[
\tilde{r}_{ver}(i,j) =
\begin{cases}
r(i,j) & \text{if } i = 0 \\
r(i,j) - r(i-1,j) & \text{otherwise}
\end{cases}
\tag{1}
\]
The horizontal inter RDPCM mode is defined in a similar
way: samples in the first column in the block are left
unchanged, while all other samples are predicted from the
sample immediately on the left in the same row.
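The two forward modes can be illustrated with a minimal Python sketch. It is not taken from the HM-RExt software: the function name `forward_rdpcm`, the mode labels and the sample block are assumptions made for illustration, and the block is stored as a list of rows so that `block[i][j]` plays the role of r(i, j).

```python
def forward_rdpcm(block, mode):
    """Forward inter RDPCM on a residual block given as a list of rows.

    mode is "ver", "hor" or "none"; these labels are illustrative only.
    """
    if mode not in ("ver", "hor", "none"):
        raise ValueError("unknown RDPCM mode: " + repr(mode))
    rows, cols = len(block), len(block[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if mode == "ver" and i > 0:
                pred = block[i - 1][j]   # sample immediately above
            elif mode == "hor" and j > 0:
                pred = block[i][j - 1]   # sample immediately to the left
            else:
                pred = 0                 # first row/column, or no RDPCM
            out[i][j] = block[i][j] - pred
    return out


# Hypothetical 3x4 residual block with a vertical edge between columns 1 and 2.
residual = [[0, 0, 9, 9],
            [0, 0, 9, 9],
            [0, 0, 9, 9]]
vertical = forward_rdpcm(residual, "ver")
horizontal = forward_rdpcm(residual, "hor")
```

For this block the vertical mode leaves nonzero values only in the first row, whereas the horizontal mode leaves one nonzero value per row, which matches the intuition that DPCM along the edge direction removes most of the remaining correlation.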
Along with the horizontal and vertical RDPCM options, the
no RDPCM case is tested at the encoder. This is typically a
coding choice for those residuals that are highly uncorrelated,
i.e. where no further spatial prediction is needed. To find the
best mode for the current residual block the sum of absolute
differences (SAD) distortion metric is computed for each mode
(i.e. horizontal, vertical or no RDPCM). The mode with
minimum SAD is selected as the best. Notice that the solution
with minimum distortion is used instead of the solution with
minimum coding rate, because the computational complexity
required to compute this rate would be too high, since for each
mode and level of granularity a CABAC encoding of the
residuals would be needed. Finally, in this implementation the
inter RDPCM mode is signaled to the decoder using CABAC
with the following binary representation: no RDPCM (0),
horizontal RDPCM (10) and vertical RDPCM (11). One
context for each block size and color component is used.
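The mode decision described above can be sketched as follows. This is only an illustration under stated assumptions: the SAD is taken here as the sum of absolute values of the residuals each mode would pass to entropy coding, ties are resolved in favour of the first mode listed, and the names `rdpcm`, `sad`, `select_mode` and `MODE_BINS` are hypothetical. Only the bin strings are shown; the CABAC context modelling and arithmetic coding are not reproduced.

```python
def rdpcm(block, mode):
    """Forward RDPCM; mode is "none", "hor" or "ver" (illustrative labels)."""
    out = []
    for i, row in enumerate(block):
        new_row = []
        for j, value in enumerate(row):
            if mode == "ver" and i > 0:
                value -= block[i - 1][j]   # predict from the sample above
            elif mode == "hor" and j > 0:
                value -= row[j - 1]        # predict from the sample on the left
            new_row.append(value)
        out.append(new_row)
    return out


def sad(block):
    """Sum of absolute values of all samples in the block."""
    return sum(abs(v) for row in block for v in row)


# Binarization given in the text: no RDPCM (0), horizontal (10), vertical (11).
MODE_BINS = {"none": "0", "hor": "10", "ver": "11"}


def select_mode(block):
    """Return (best mode, its bin string, SAD cost of every mode)."""
    costs = {mode: sad(rdpcm(block, mode)) for mode in MODE_BINS}
    best = min(MODE_BINS, key=costs.get)   # assumption: ties go to the first mode
    return best, MODE_BINS[best], costs


# Hypothetical residual blocks: a vertical edge and a horizontal edge.
vertical_edge = [[0, 0, 9, 9], [0, 0, 9, 9], [0, 0, 9, 9]]
horizontal_edge = [[0, 0, 0], [7, 7, 7], [7, 7, 7]]
choice_a = select_mode(vertical_edge)
choice_b = select_mode(horizontal_edge)
```

Evaluating three SAD values per block is much cheaper than running an entropy coder for each candidate, which is the trade-off the text motivates.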
At the decoder side, when vertical RDPCM is selected, the
residuals r(i, j) to be added to the motion compensated
prediction are obtained as follows:

\[
r(i,j) = \sum_{k=0}^{i} \tilde{r}_{ver}(k,j). \tag{2}
\]
For horizontal RDPCM, the summation is performed across the
current row.
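A decoder-side sketch of this reconstruction is given below, assuming the same list-of-rows layout for the block. The summation of Eq. (2) is implemented as a running accumulation, which yields the same values with one addition per sample; the function name `inverse_rdpcm`, the mode labels and the coded values are illustrative assumptions.

```python
def inverse_rdpcm(coded, mode):
    """Reconstruct residuals from RDPCM-coded values (list of rows).

    mode is "ver", "hor" or "none"; these labels are illustrative only.
    """
    if mode not in ("ver", "hor", "none"):
        raise ValueError("unknown RDPCM mode: " + repr(mode))
    rows, cols = len(coded), len(coded[0])
    out = [row[:] for row in coded]
    if mode == "ver":
        # Accumulate down each column: row i becomes the sum of rows 0..i.
        for i in range(1, rows):
            for j in range(cols):
                out[i][j] += out[i - 1][j]
    elif mode == "hor":
        # Accumulate along each row: column j becomes the sum of columns 0..j.
        for i in range(rows):
            for j in range(1, cols):
                out[i][j] += out[i][j - 1]
    return out


# Hypothetical coded residuals for a 3x2 block.
coded = [[3, -1],
         [2, 4],
         [-5, 1]]
rec_ver = inverse_rdpcm(coded, "ver")
rec_hor = inverse_rdpcm(coded, "hor")
```

The accumulation makes the serial dependency explicit: each reconstructed sample needs the reconstructed sample before it in the prediction direction.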
C. Additional tools for inter RDPCM
As can be seen from Eq. (2), the reconstruction of a
residual in the i-th row depends on the previous i - 1 samples.
For large TU sizes (e.g. 32×32) samples located at the
bottom rows (rightmost columns for horizontal RDPCM)
require a high number of additions before becoming available.
This increases the computational complexity and the
dependency between samples, which may not be acceptable in
some applications. Moreover, from the description of the
horizontal and vertical RDPCM given in Section III.B, it may
be noted that the samples in the first column (respectively the
first row) are not RDPCM predicted. These two observations
motivated the design of the two proposed prediction chunking
(PC) and hierarchical prediction (HP) tools.
The prediction chunking tool limits the residual DPCM
prediction to groups of samples with a specified length L,
denoted as chunking length. In this way the RDPCM process is
reset every L samples so that the number of operations per
sample at the decoder side is reduced. The vertical RDPCM
prediction when the PC tool is used is defined as follows:


\[
\tilde{r}_{ver}(i,j) =
\begin{cases}
r(i,j) & \text{if } i = 0, L, 2L, \ldots \\
r(i,j) - r(i-1,j) & \text{otherwise}
\end{cases}
\tag{3}
\]
At the decoder, the residuals can be reconstructed as follows:

\[
r(i,j) = \sum_{k = L \lfloor i/L \rfloor}^{i} \tilde{r}_{ver}(k,j), \tag{4}
\]
where the operator ⌊·⌋ returns the largest integer smaller
than or equal to its argument. Equivalent
expressions for forward and inverse inter RDPCM can be
easily derived for the horizontal mode when using PC.
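Prediction chunking for the vertical mode can be sketched in Python as below. The function names, the parameter `chunk_len` (standing for the chunking length L) and the sample values are assumptions for illustration; the block is a list of rows.

```python
def forward_ver_pc(block, chunk_len):
    """Vertical RDPCM with prediction chunking: rows 0, L, 2L, ... are kept."""
    out = []
    for i, row in enumerate(block):
        if i % chunk_len == 0:
            out.append(row[:])                 # chunk start: no prediction
        else:
            above = block[i - 1]
            out.append([v - a for v, a in zip(row, above)])
    return out


def inverse_ver_pc(coded, chunk_len):
    """Inverse of forward_ver_pc: accumulation restarts at every chunk start."""
    out = []
    for i, row in enumerate(coded):
        if i % chunk_len == 0:
            out.append(row[:])
        else:
            out.append([c + p for c, p in zip(row, out[i - 1])])
    return out


# Hypothetical 5x2 residual block, chunking length L = 2.
residual = [[1, 10], [2, 20], [4, 40], [7, 70], [11, 110]]
coded = forward_ver_pc(residual, 2)
restored = inverse_ver_pc(coded, 2)
```

With chunking length L, no reconstructed sample depends on more than L - 1 previous additions, and chunks can be reconstructed independently of each other. The cost is that the first row of every chunk is sent without prediction.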
Once RDPCM is performed on a block, samples in the first
column and the first row for horizontal and vertical RDPCM,
respectively, are not predicted. Therefore it is beneficial to
exploit redundancy by performing prediction on these samples
in the direction orthogonal to the main RDPCM direction, as
shown in Figure 1. The HP tool performs an RDPCM along the
first column of samples when horizontal RDPCM is selected as
the best mode, or along the first row for vertical RDPCM. For
the case of vertical RDPCM, the HP is defined as:


\[
\tilde{r}_{ver}(0,j) =
\begin{cases}
r(0,j) & \text{if } j = 0 \\
r(0,j) - r(0,j-1) & \text{otherwise}
\end{cases}
\tag{5}
\]
A similar formalization can be defined for horizontal RDPCM.
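A minimal sketch of hierarchical prediction for the vertical mode follows, assuming a block stored as a list of rows. Rows below the first are coded with the vertical rule of Eq. (1), and the first row is coded with the horizontal rule of Eq. (5). The function names and sample values are illustrative assumptions, not the HM-RExt implementation.

```python
def forward_ver_hp(block):
    """Vertical RDPCM plus hierarchical prediction on the first row."""
    rows, cols = len(block), len(block[0])
    out = [[0] * cols for _ in range(rows)]
    # First row: horizontal DPCM, since it has no row above to predict from.
    for j in range(cols):
        left = block[0][j - 1] if j > 0 else 0
        out[0][j] = block[0][j] - left
    # Remaining rows: vertical DPCM from the sample above.
    for i in range(1, rows):
        for j in range(cols):
            out[i][j] = block[i][j] - block[i - 1][j]
    return out


def inverse_ver_hp(coded):
    """Undo forward_ver_hp: rebuild the first row, then accumulate downwards."""
    rows, cols = len(coded), len(coded[0])
    out = [row[:] for row in coded]
    for j in range(1, cols):
        out[0][j] += out[0][j - 1]
    for i in range(1, rows):
        for j in range(cols):
            out[i][j] += out[i - 1][j]
    return out


# Hypothetical 3x3 residual block.
residual = [[5, 5, 6],
            [5, 5, 6],
            [4, 5, 6]]
coded = forward_ver_hp(residual)
restored = inverse_ver_hp(coded)
```

In this example only the top-left sample is left unpredicted, so the correlation along the first row is exploited as well. The decoder must reconstruct the first row before the vertical accumulation can start.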


[Figure 1 placeholder: an M×N block of residual samples r(0,0) ... r(M-1,N-1), with M rows and N columns.]
Figure 1: Hierarchical prediction (green line) - an additional step
is applied on the top row after vertical RDPCM (red lines).

IV. EXPERIMENTAL RESULTS
This section presents the evaluation of inter RDPCM
together with the PC and HP tools. First the test setup,
benchmarks and performance indicators are introduced,
followed by the experimental results.
A. Test material, coding conditions and benchmarking
The screen content considered by the JCT-VC (Class F [12]
and Class SC [13]) has been used in the experiments. Class F
content belongs to the JCT-VC common test conditions set.
The color space is the YCbCr ITU-R-BT.709 and the chroma
sampling is 4:2:0. Class SC content belongs to the HM-RExt
screen content coding test set. The color space representation
is RGB and the chroma sampling is 4:4:4. All sequences have
been coded in lossless mode using the HM-RExt software
version (HM10-RExt-2.0). The same software has been used to
implement the proposed inter RDPCM, PC and HP tools. The
coding configurations are random access main (RA-main) and
low delay main (LD-main) with 8-bit internal representation
[13].
The compression efficiency and complexity associated with
the proposed lossless tools will be compared using the HM10-
RExt-2.0 codec as a benchmark. As performance indicators, the
bitrate reductions in percentage against the benchmark will be
used to measure the compression efficiency. The bitrate
reduction is denoted as Rate and negative values mean an
improvement with respect to the benchmark.
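The sign convention can be made concrete with a short sketch. The text only states that the indicator is a percentage against the benchmark and that negative values mean an improvement, so the exact formula below (relative difference in coded bits) is an assumption, as are the function name and the numbers used.

```python
def rate_change_percent(bits_proposed, bits_benchmark):
    """Relative bitrate change of the proposed codec against the benchmark.

    Assumed definition: 100 * (proposed - benchmark) / benchmark, so that a
    negative result means the proposed codec spends fewer bits.
    """
    if bits_benchmark <= 0:
        raise ValueError("benchmark size must be positive")
    return 100.0 * (bits_proposed - bits_benchmark) / bits_benchmark


# Hypothetical coded sizes in bits, not measurements from the paper.
change = rate_change_percent(920, 1000)
```

Under this definition the hypothetical example corresponds to an 8% saving, reported as -8.0.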
B. Experimental results and discussion
The results obtained are listed in Table 1. As may be
observed, the inter RDPCM method provides good average
bitrate reductions at test points for all levels of RDPCM
implementation in the coding hierarchy (CU, PU, TU).
Reductions of up to 16% are reported for LD-main in the TU
level implementation. From the presented results it may be also
observed that the finer the level of granularity where the inter
RDPCM is applied, the greater the bitrate reductions. This is
expected, since for a higher level of granularity (e.g. CU) a
given RDPCM mode may not be optimal for all the samples
inside the unit. Conversely, as the granularity gets finer at PU
and TU levels, more suitable combinations of RDPCM modes
can be used over the associated sub-partitions. The price to pay
for better bitrate reductions is increased encoder complexity, which
reaches its maximum at the TU level. However, it is interesting
to note that at the decoder side the complexity is essentially unchanged
with respect to the benchmark at all levels.
The level of granularity which leads to the best trade-off
between coding efficiency and complexity is the PU level. For
this reason this implementation is considered to assess the
performance of the PC and HP tools. The performance of the
PC coding tool is also listed in Table 1. Since the PC tool
reduces the computational complexity, a coding efficiency loss
is expected. In particular up to 0.5% of bitrate reduction is
sacrificed to decrease the complexity. Conversely, the
coding efficiency benefits from the application of the HP tool:
the average reduction reaches up to 8.8% for Class F.
Finally, to illustrate how the inter RDPCM modes are chosen,
the selection of the vertical and the horizontal RDPCM modes
for one frame of the Programming test sequence is shown in
Figure 2. As may be noted the vertical RDPCM mode is
selected for blocks with sharp edges along the vertical
dimension and vice-versa for the horizontal mode.

Table 1: Measured bitrate reductions and complexity for the proposed lossless coding tools.

                    CU-based          PU-based          TU-based          PC (PU-based)     HP (PU-based)
                    RA-main  LD-main  RA-main  LD-main  RA-main  LD-main  RA-main  LD-main  RA-main  LD-main
Class F, Rate [%]      -5.0     -7.7     -5.1     -8.0     -5.4     -8.2     -4.3     -6.4     -5.5     -8.8
Class SC, Rate [%]     -3.2     -6.6     -3.7     -7.6     -3.9     -7.7     -3.5     -6.9     -3.8     -7.4
Average, Rate [%]      -4.1     -7.1     -4.4     -7.8     -4.7     -7.9     -3.9     -6.6     -4.6     -8.1
Enc. Time [%]           101      102      104      103      110      108      107      105      104      103
Dec. Time [%]           100      100      100      100      101      100      100      100      100      100

Figure 2: Second frame of the Programming sequence. Left: original frame. Right: inter RDPCM mode selection overlaid. Red blocks: horizontal RDPCM. Blue blocks: vertical RDPCM.
V. CONCLUSIONS AND FUTURE WORK
In this paper the inter RDPCM coding tool has been
proposed to improve inter prediction in lossless screen content
coding. Moreover, two other tools for reducing the complexity
or increasing the compression efficiency have also been
designed. Experimental results show that the application of
inter RDPCM is suitable for screen content. Moreover, the
proposed PC and HP have proved to be effective in
complexity reduction and coding efficiency improvement,
respectively. Future work will involve the extension of the
inter RDPCM tool for the lossy coding scenario at very high
bitrates which typically lead to visually lossless compression.
REFERENCES
[1] T. Vermeir, "Use cases and requirements for lossless and screen content coding," JCTVC-M0172, 13th JCT-VC meeting, Incheon, KR, Apr. 2013.
[2] J. Sole, R. Joshi and M. Karczewicz, "Requirements for wireless display applications," JCTVC-M0315, 13th JCT-VC meeting, Incheon, KR, Apr. 2013.
[3] G. J. Sullivan, J.-R. Ohm, W.-J. Han and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) standard," IEEE Trans. on Circuits and Syst. for Video Technol., vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[4] J.-R. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan and T. Wiegand, "Comparison of the coding efficiency of video coding standards including High Efficiency Video Coding (HEVC)," IEEE Trans. on Circuits and Syst. for Video Technol., vol. 22, no. 12, pp. 1669-1683, Dec. 2012.
[5] ITU-T, "High Efficiency Video Coding," ITU-T H.265, Edition 1.0 (pre-published), Apr. 2013.
[6] A. Gabriellini, D. Flynn, M. Mrak and T. Davies, "Combined Intra-Prediction for High-Efficiency Video Coding," IEEE J. of Sel. Topics in Signal Processing, vol. 5, no. 7, pp. 1282-1289, Nov. 2011.
[7] Y.-L. Lee, K.-H. Han and G. J. Sullivan, "Improved lossless intra coding for H.264/MPEG-4 AVC," IEEE Trans. on Image Processing, vol. 15, no. 9, pp. 2610-2615, Sept. 2006.
[8] S. Lee, I.-K. Kim and C. Kim, "Residual DPCM for HEVC lossless coding," JCTVC-M0079, 13th JCT-VC meeting, Incheon, KR, Apr. 2013.
[9] K.-H. Han, K. R. Rao and Y.-L. Lee, "Residual DPCM about Motion Compensated Residual Signal for H.264 Lossless Coding," IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences, vol. E92., no. 5, pp. 1386-1389, 2009.
[10] M. Zhou, W. Gao, M. Jiang and H. Yu, "HEVC Lossless Coding and Improvements," IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1839-1843, Dec. 2012.
[11] M. Zhou and M. Budagavi, "Experimental results on Test 3 and Test 4," JCTVC-M0056, 13th JCT-VC meeting, Incheon, KR, Apr. 2013.
[12] F. Bossen, "Common HM test conditions and software reference configurations," JCTVC-L1100, 12th JCT-VC meeting, Geneva, CH, Jan. 2013.
[13] W. Gao, M. Zhou, P. Amon and S. Lee, "HEVC Range Extensions Core Experiment 2 (RCE2): Intra Prediction for Lossless Coding," JCTVC-L1122, 12th JCT-VC meeting, Geneva, CH, Jan. 2013.
