Fast Coding Quad-Tree Decisions Using Prediction Residuals Statistics For High Efficiency Video Coding (HEVC)
Abstract—High Efficiency Video Coding (HEVC) is the latest video coding
standard to meet market demands for real-time high quality video codecs.
As compared to its predecessor H.264/AVC, HEVC can achieve significant
compression gains but with higher encoding complexity. Therefore, for
real-time applications, significant encoding time reduction is still
necessary. In the HEVC test model, the large number of coding quad-tree
decisions to be tested during rate-distortion optimization results in high
encoding time. Hence, we propose a method to reduce the high encoding time
by pruning the coding quad-trees using prediction residuals statistics.
Experimental results from HM16.3-based implementations show that the
proposed residual-based pruning method can reduce encoding time by an
average of about 44% with an average coding loss of about 1.0%.

Index Terms—High Efficiency Video Coding (HEVC), Coding Quad-Tree.

I. INTRODUCTION

RECENT advancements in integrated circuits and multimedia technology have
made consumer mobile devices ubiquitous, allowing video content to be
broadcast or transmitted anytime and anywhere. This directly increases
market demands for real-time high quality video codecs. For instance,
mobile video calling and conferencing applications require consumer mobile
devices to encode and decode video content in real-time. Furthermore,
online social networking and broadcasting platforms have been extending
their capabilities to allow consumers to live stream events using mobile
devices. Even commercial broadcasting companies have been relying on small
consumer grade devices for live news updates from places with accessibility
constraints. As these devices are being equipped with increasingly high
resolution content capture systems, consumers will have the capability to
produce higher quality video content. In order for the content to be
readily available for consumption, there is a need for high performance
video codecs that are able to stream high quality video content within
similar bandwidth and time constraints. Nonetheless, the video codecs must
remain low in complexity to run effectively on these low power devices with
smaller integrated circuits.

High Efficiency Video Coding (HEVC) [1], [2] is the latest video coding
standard developed by the Joint Collaborative Team - Video Coding (JCT-VC),
which comprises the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC
Moving Picture Experts Group (MPEG). As compared to the existing H.264/AVC
standard [3], [4], the HEVC standard approximately doubles the data
compression rate for equal perceptual video quality [5]. Designed to
address increased video resolutions, the attractive compression benefits
offered by HEVC will ease the bandwidth bottleneck for the deployment of HD
and beyond HD content.

Nonetheless, HEVC is reported to be several times more complex than
H.264/AVC [2] and particularly suffers from high encoding time. The high
encoding time of HEVC is attributed to various new coding tools and
features that are introduced to provide increased coding flexibility. For
instance, HEVC employs a flexible hierarchical picture partitioning
structure that enables the use of large and multiple sizes of processing
units. HEVC also employs more comprehensive intra prediction, inter
prediction, transform and quantization, a new loop filter, and an enhanced
version of entropy coding. Such sophisticated tools come with performance
improvement but at the expense of higher coding complexity, and this
complexity would considerably hamper its usage, particularly in
applications where real-time operation is required.

In this paper, we propose a method to reduce the high encoding time of
HEVC, by pruning its coding quad-trees using prediction residuals
statistics. The proposed residual-based pruning method requires few
modifications to the existing encoder and adds negligible computational
overhead, but can effectively reduce redundant mode and partition checks.

This paper is organized as follows. First, an overview of the HEVC picture
partitioning structure is provided in Section II. Then, prior works on fast
encoding methods for coding quad-tree decisions are discussed in
Section III. Subsequently, our proposed method is presented in Section IV,
and the experimental results are shown in Section V. Finally, the
conclusion is discussed in Section VI.

II. OVERVIEW OF HEVC PICTURE PARTITIONING STRUCTURE

As shown in Fig. 1, like its predecessor standards such as MPEG-4 and
H.264/AVC, HEVC adopts the conventional approach of using temporal and
spatial prediction, followed by transform coding of the prediction
residuals and entropy coding of the quantized transform coefficients with
other coding parameters. Nonetheless, HEVC uses a flexible hierarchical
picture partitioning structure to enable large and multiple sizes of coding
units (CUs), prediction units (PUs), and transform units (TUs) [2].

A CU defines a region sharing the same prediction mode (i.e., SKIP, INTER,
or INTRA), a PU defines a region sharing the same prediction information
(e.g., intra prediction modes or motion parameters), and a TU defines a
region sharing the same transformation and quantization. As shown in
Fig. 2, in a coding quad-tree representation, a picture is first divided
into non-overlapping coding tree units (CTUs), which can then be
recursively divided into smaller CUs. A leaf CU, which is a CU that is not
divided into sub-CUs, can either be SKIP, INTER or INTRA coded. The CTU
size is 64×64 (depth 0), and the minimum CU size is 8×8 (depth 3) [6]. An
INTER or INTRA coded CU can then be further divided into PUs of various
partitions. The possible PU partitions for a CU of size 2N × 2N are

Manuscript received June 19, 2015; revised October 11, 2015; accepted
October 21, 2015. Date of publication January 12, 2016; date of current
version March 2, 2016. (Corresponding author: Susanto Rahardja.)
H. L. Tan is with the Institute for Infocomm Research, A*STAR,
Singapore 138632, and also with the National University of Singapore,
Singapore 117583 (e-mail: [email protected]).
C. C. Ko is with the National University of Singapore, Singapore 117583
(e-mail: [email protected]).
S. Rahardja is with Northwestern Polytechnical University, Xi’an,
Shaanxi, 710072, P.R. China, and also with the National University of
Singapore, Singapore 117583 (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TBC.2015.2505406
0018-9316 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
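
As a rough illustration of the coding quad-tree outlined in Section II, the sketch below recursively splits one 64×64 CTU (depth 0) into leaf CUs no smaller than 8×8 (depth 3). The function name `partition_ctu` and the `should_split` callback are hypothetical stand-ins for the encoder's split decision. They are not taken from the HM software or from the proposed method.

```python
from typing import Callable, List, Tuple

CTU_SIZE = 64     # CTU size at depth 0, as stated in Section II
MIN_CU_SIZE = 8   # minimum CU size at depth 3, as stated in Section II

# A leaf CU is recorded as (x, y, size, depth).
LeafCU = Tuple[int, int, int, int]


def partition_ctu(should_split: Callable[[int, int, int, int], bool],
                  x: int = 0, y: int = 0,
                  size: int = CTU_SIZE, depth: int = 0) -> List[LeafCU]:
    """Recursively split a CTU into leaf CUs.

    `should_split` is a hypothetical placeholder for the encoder's decision
    (for example, a rate-distortion comparison); it is an assumption here.
    """
    if size > MIN_CU_SIZE and should_split(x, y, size, depth):
        half = size // 2
        leaves: List[LeafCU] = []
        # Visit the four sub-CUs in raster order: top-left, top-right,
        # bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                leaves += partition_ctu(should_split, x + dx, y + dy,
                                        half, depth + 1)
        return leaves
    # Not split: this CU is a leaf of the coding quad-tree.
    return [(x, y, size, depth)]


# Split every CU down to the minimum size.
full = partition_ctu(lambda x, y, size, depth: True)

# Split only the CU that touches the top-left corner at each depth.
corner = partition_ctu(lambda x, y, size, depth: x == 0 and y == 0)
```

Pruning a coding quad-tree, in this picture, amounts to making `should_split` return `False` early so that whole sub-trees are never evaluated.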
IEEE TRANSACTIONS ON BROADCASTING, VOL. 62, NO. 1, MARCH 2016 129
TABLE I
CODING LOSSES (BD-RATE) AND ENCODING TIMES (ENC TIME) FOR RANDOM ACCESS (RA)
SETTING WITH MAIN AND MAIN 10 CONFIGURATIONS
TABLE II
CODING LOSSES (BD-RATE) AND ENCODING TIMES (ENC TIME) FOR LOW DELAY (LD)
SETTING WITH MAIN AND MAIN 10 CONFIGURATIONS
TABLE III
CODING LOSSES (LUMA BD-RATE) AND ENCODING TIMES (ENC TIME) FOR ALL
COMBINATIONS OF RANDOM ACCESS (RA) AND LOW DELAY (LD) SETTINGS WITH MAIN
AND MAIN 10 CONFIGURATIONS
VI. CONCLUSION
HEVC is important to the economical adoption of the emerging high quality
digital visual content formats. Nonetheless, further encoding time
reduction is necessary for its use in real-time applications. In this
paper, we proposed a method for reducing the high encoding time of HEVC by
pruning its coding quad-trees using prediction residuals statistics. Our
experimental results show that the proposed residual-based pruning method
effectively reduces redundant mode and partition checks; encoding time is
reduced by an average of about 44% with an average coding loss of about
1.0%. Because it uses only simple second-order prediction residuals
statistics from the current CU, independently of its spatially and
temporally neighbouring CUs, the proposed method requires few modifications
to the existing encoder and adds negligible computational overhead.
In the future, we will investigate ways of reducing unnecessary symmetric
motion partition (SMP) and asymmetric motion partition (AMP) checks.
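
To make the phrase "second order prediction residuals statistics from a current CU" concrete, the sketch below computes the mean and variance of one CU's residual block from its first- and second-order sums. The pruning rule and the threshold are placeholders chosen only for illustration. The criteria the paper actually uses are defined in Section IV, which is not reproduced here, and the function names are hypothetical.

```python
from typing import Sequence, Tuple


def residual_stats(residual: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (mean, variance) of a 2-D block of prediction residuals.

    Only the samples of the current CU are used; no neighbouring CU is read.
    """
    n = sum(len(row) for row in residual)
    s1 = sum(v for row in residual for v in row)       # first-order sum
    s2 = sum(v * v for row in residual for v in row)   # second-order sum
    mean = s1 / n
    return mean, s2 / n - mean * mean


def may_prune_split(residual: Sequence[Sequence[float]],
                    var_threshold: float) -> bool:
    """Placeholder rule (an assumption, not the paper's criterion).

    Treat the CU as well predicted, and skip testing its sub-CUs, when the
    residual variance falls below `var_threshold`.
    """
    _, variance = residual_stats(residual)
    return variance < var_threshold


# Made-up 2x2 residual blocks for illustration.
flat_block = [[0, 0], [0, 0]]
busy_block = [[0, 10], [0, 10]]
```

Both sums can be accumulated in a single pass over the residual samples, which is one reason a statistic of this kind costs little next to a full rate-distortion check.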
Fig. 5. Plot of coding loss vs encoding time for various fast encoding methods
under Random Access (RA) Main and Low Delay (LD) Main configurations.
To better understand the encoding time overheads of our method, we turn on
the prediction residuals statistics analysis algorithm but turn off the
pruning algorithm at every CU. The encoding time increase is close to 0%,
suggesting that the prediction residuals statistics analysis algorithm
incurs insignificant encoding time overhead.
We now discuss other prior fast encoding methods which are not present in
the reference software. As these prior methods are proposed based on
earlier HM versions and are evaluated using different test conditions, the
performance-complexity operating points of these prior methods may not be
directly comparable to the performance-complexity operating points of our
proposed method. Nonetheless, as the source codes of these prior methods
are not publicly available, evaluation of these prior methods on HM16.3 is
not viable. In general, these prior methods reported about 40% encoding
time reduction, with the Bayesian method [23] and the

ACKNOWLEDGMENT
The main author would like to thank Dr. Chuohao Yeo and Dr. Yih Han Tan for
their suggestions.

REFERENCES
[1] High Efficiency Video Coding, document ITU-T Rec. H.265, Oct. 2014.
[2] F. Bossen, B. Bross, K. Suhring, and D. Flynn, “HEVC complexity and
implementation analysis,” IEEE Trans. Circuits Syst. Video Technol.,
vol. 22, no. 12, pp. 1685–1696, Dec. 2012.
[3] Advanced Video Coding for Generic Audio-Visual Services, document
ITU-T Rec. H.264 and ISO/IEC 14496-10 (AVC), ITU-T and ISO/IEC JTC 1,
May 2003.
[4] I. E. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for
Next-Generation Multimedia. Chichester, U.K.: Wiley, 2004.
[5] J. Vanne, M. Viitanen, T. D. Hamalainen, and A. Hallapuro, “Comparative
rate-distortion-complexity analysis of HEVC and AVC video codecs,” IEEE
Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1885–1898,
Dec. 2012.
[6] F. Bossen, Common HM Test Conditions and Software Reference
Configurations, document JCTVC-L1100, JCT-VC, Geneva, Switzerland,
Jan. 2013.
[7] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and
video compression,” IEEE Signal Process. Mag., vol. 15, no. 6, pp. 23–50,
Nov. 1998.
[8] G. J. Sullivan and T. Wiegand, “Rate-distortion optimization for video
compression,” IEEE Signal Process. Mag., vol. 15, no. 6, pp. 74–90,
Nov. 1998.
[9] C. Grecos and M. Y. Yang, “Fast inter mode prediction for P slices in
the H264 video coding standard,” IEEE Trans. Broadcast., vol. 51, no. 2,
pp. 256–263, Jun. 2005.
[10] J. Hou, S. Wan, Z. Ma, and L.-P. Chau, “Consistent video quality
control in scalable video coding using dependent distortion quantization
model,” IEEE Trans. Broadcast., vol. 59, no. 4, pp. 717–724, Dec. 2013.
[11] Y. Wang, L.-P. Chau, and K.-H. Yap, “Bit-rate allocation for
broadcasting of scalable video over wireless networks,” IEEE Trans.
Broadcast., vol. 56, no. 3, pp. 288–295, Sep. 2010.
[12] K. McCann et al., High Efficiency Video Coding (HEVC) Test Model 16
(HM16) Encoder Description, document JCTVC-R1002, JCT-VC, Sapporo, Japan,
Jul. 2014.
[13] J. Kim, J. Yang, K. Won, and B. Jeon, “Early determination of mode
decision for HEVC,” in Proc. Pict. Coding Symp. (PCS), Kraków, Poland,
May 2012, pp. 449–452.
[14] J. Yang, J. Kim, K. Won, H. Lee, and B. Jeon, Early SKIP Detection for
HEVC, document JCTVC-G543, JCT-VC, Geneva, Switzerland, Nov. 2011.
[15] R. H. Gweon, Y.-L. Lee, and J. Lim, Early Termination of CU Encoding
to Reduce HEVC Complexity, document JCTVC-F045, JCT-VC, Turin, Italy,
Jul. 2011.
[16] K. Choi, S. H. Park, and E. S. Jang, Coding Tree Pruning Based CU
Early Termination, document JCTVC-F092, JCT-VC, Turin, Italy, Jul. 2011.
[17] K. Choi and E. S. Jang, “Fast coding unit decision method based on
coding tree pruning for high efficiency video coding,” Opt. Eng., vol. 51,
no. 3, 2012, Art. ID 030502. [Online]. Available:
http://dx.doi.org/10.1117/1.OE.51.3.030502
[18] K. McCann, B. Bross, S.-I. Sekiguchi, and W.-J. Han, High Efficiency
Video Coding (HEVC) Test Model 2 (HM 2) Encoder Description, document
JCTVC-D502, JCT-VC, Daegu, Korea, Oct. 2010.
[19] J. Kim, S. Jeong, S. Cho, and J. S. Choi, “Adaptive coding unit early
termination algorithm for HEVC,” in Proc. IEEE Int. Conf. Consum.
Electron. (ICCE), Las Vegas, NV, USA, Jan. 2012, pp. 261–262.
[20] H. L. Tan, F. Liu, Y. H. Tan, and C. Yeo, “On fast coding tree block
and mode decision for high-efficiency video coding (HEVC),” in Proc. IEEE
Int. Conf. Acoust. Speech Signal Process. (ICASSP), Kyoto, Japan,
Mar. 2012, pp. 825–828.
[21] J. Leng, L. Sun, T. Ikenaga, and S. Sakaida, “Content based
hierarchical fast coding unit decision algorithm for HEVC,” in Proc. Int.
Conf. Multimedia Signal Process. (CMSP), vol. 1. Guilin, China, May 2011,
pp. 56–59.
[22] L. Shen, Z. Liu, X. Zhang, W. Zhao, and Z. Zhang, “An effective CU
size decision method for HEVC encoders,” IEEE Trans. Multimedia, vol. 15,
no. 2, pp. 465–470, Feb. 2013.
[23] X. Shen, L. Yu, and J. Chen, “Fast coding unit size selection for HEVC
based on Bayesian decision rule,” in Proc. Pict. Coding Symp. (PCS),
Kraków, Poland, May 2012, pp. 453–456.
[24] Y. H. Tan, C. Yeo, H. L. Tan, and Z. Li, “On residual quad-tree coding
in HEVC,” in Proc. IEEE 13th Int. Workshop Multimedia Signal Process.
(MMSP), Hangzhou, China, Oct. 2011, pp. 1–4.
[25] H. R. Wu and K. R. Rao, Digital Video Image Quality and Perceptual
Coding (Signal Processing and Communications). Boca Raton, FL, USA: CRC
Press, 2005.
[26] G. Bjontegaard, Calculation of Average PSNR Differences Between RD
Curves, document VCEG-M33, ITU-T Q6/SG16, Austin, TX, USA, Apr. 2001.