
Presented By: Avishek Dookhun, Rikesh Ramlochund, Irfaan Bhojoo

H.265 HIGH EFFICIENCY VIDEO CODING (HEVC)

WHAT IS H.265?

Also known as HEVC (High Efficiency Video Coding) or H.NGVC (Next-generation Video Coding)

A draft video compression standard and successor to H.264/MPEG-4 AVC (Advanced Video Coding), currently under joint development by the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG).

MPEG and VCEG have established a Joint Collaborative Team on Video Coding (JCT-VC) to develop the HEVC standard. It has sometimes been referred to as "H.265", since it is considered the successor of H.264, although this name is not commonly used within the standardization project. In MPEG it is also sometimes known as "MPEG-H"; however, the primary name used within the standardization project is HEVC.

HEVC is being tested with software known as JM/Key Technical Area (KTA), and the test model is known as the HEVC Test Model (HM). Various tools and features are being tested in order to improve HEVC and to meet the goals of the JCT-VC.

OVERVIEW OF H.265

HEVC aims to substantially improve coding efficiency compared to AVC High Profile, i.e. to reduce bitrate requirements by half with comparable image quality, probably at the expense of increased computational complexity. Depending on the application requirements, HEVC should be able to trade off computational complexity, compression rate, robustness to errors and processing delay time. HEVC is targeted at next-generation HDTV displays and content capture systems which feature progressive-scan frame rates and display resolutions from QVGA (320x240) up to 1080p and Super Hi-Vision, as well as improved picture quality in terms of noise level, color gamut and dynamic range.

PROBABLE GOALS OF H.265


Currently, H.265 has not been formalized, and the Video Coding Experts Group (VCEG) keeps seeking proposals and information regarding the possibility of a major gain in performance to justify the step from H.264 to H.265. Though the necessary scope of H.265 is yet to be determined, it is agreed that the goals will include:

- simplicity and a "back to basics" approach
- high coding efficiency, e.g. twice that of H.264
- computational efficiency, considering both encoder and decoder
- loss/error robustness
- network friendliness

H.265 AND PREVIOUS VERSIONS

H.265 aims to be the successor to H.264 among video compression standards. Below is a comparison of the H.26x series of video codecs in terms of complexity versus compression rate.

Figure 1: Complexity versus compression rate for the H.26x series of codecs

H.265 AND PREVIOUS VERSIONS


H.265 relies on the fact that processing power on our devices is increasing, and it uses that additional processing power to achieve better compression. Interestingly, H.265 does not include any Scalable Video Coding (SVC, the scalability extension to the H.264 standard) capabilities.

FEATURES TO TEST
Unit Definition
  Coding Tree Block (CTB)                                   Priority 1
  Prediction Unit (PU)                                      Priority 1
  Transform Unit (TU)                                       Priority 1

Motion Representation
  Motion vector prediction for rectangular partition        Priority 1
  Motion vector prediction for geometric block partition    Priority 2
  Interpolation methods                                     Priority 1
  Adaptive motion vector resolution                         Priority 1

Intra-Frame Prediction
  Adaptive reference sample smoothing                       Priority 2
  Planar prediction                                         Priority 1
  Angular prediction                                        Priority 1
  Arbitrary Directional Intra (ADI)                         Priority 2
  Combined Intra Prediction (CIP)                           Priority 1

FEATURES TO TEST (CONTINUED)


Spatial Transform
  Large transform (16x16, 32x32, 64x64)                                 Priority 1
  Rotational transform (ROT)                                            Priority 2
  Mode Dependent Directional Transform for intra-prediction residuals   Priority 1

Quantization                                                            Priority 1

In-loop Filtering
  Luma filtering                                                        Priority 1
  Deblocking filter                                                     Priority 1
  Chroma filtering                                                      Priority 1
  Planar mode filtering                                                 Priority 1

Entropy Coding
  Low Complexity Entropy Coding                                         Priority 1
  High Coding Efficiency Entropy Coding                                 Priority 1

CODING TREE BLOCK


The coding tree block (CTB) is defined as a basic unit which has a square shape. It is similar to the macroblock and sub-macroblock of H.264/AVC; the main difference is that CTBs can have various sizes. All processing except frame-based loop filtering is performed on a CTB basis, including intra/inter prediction, transform, quantization and entropy coding.

Two special terms are defined: the largest coding tree block (LCTB) and the smallest coding tree block (SCTB). For convenient implementation, the LCTB and SCTB sizes are limited to powers of 2 that are greater than or equal to 8. A picture is assumed to consist of non-overlapping LCTBs. Since the CTB is restricted to a square shape, the CTB structure within an LCTB can be expressed as a recursive tree representation adapted to the picture; that is, a CTB is characterized by the LCTB size and by its hierarchical depth within the LCTB to which it belongs.

Figure 2
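Because each CTB is square and splits recursively into four smaller CTBs, the partitioning inside one LCTB can be represented as a quadtree. The sketch below illustrates that idea only; it is not HM code, the 64/8 size limits are example values allowed by the rule above, and the split decision is a placeholder (a real encoder decides splits by rate-distortion optimization).

```c
/*
 * Minimal sketch of a CTB quadtree inside one LCTB: each node is either a
 * leaf CTB or splits into four equally sized child CTBs. Sizes are powers
 * of two and >= 8, as stated above. Illustration only, not HM code.
 */
#include <stdio.h>

#define LCTB_SIZE 64   /* example largest CTB size  */
#define SCTB_SIZE 8    /* example smallest CTB size */

/* Placeholder split rule based only on position, purely for illustration
 * (a real encoder would use rate-distortion optimization here). */
static int should_split(int x, int y, int size)
{
    return size > SCTB_SIZE && ((x / size) + (y / size)) % 2 == 0;
}

/* Recursively walk the CTB quadtree and print the leaf CTBs. */
static void walk_ctb(int x, int y, int size, int depth)
{
    if (should_split(x, y, size)) {
        int half = size / 2;
        walk_ctb(x,        y,        half, depth + 1);
        walk_ctb(x + half, y,        half, depth + 1);
        walk_ctb(x,        y + half, half, depth + 1);
        walk_ctb(x + half, y + half, half, depth + 1);
    } else {
        printf("leaf CTB at (%2d,%2d), size %2d, depth %d\n", x, y, size, depth);
    }
}

int main(void)
{
    /* A picture is assumed to consist of non-overlapping LCTBs; walk one. */
    walk_ctb(0, 0, LCTB_SIZE, 0);
    return 0;
}
```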

INTERPOLATION METHODS
Reason to use interpolation

Motion-compensated prediction (MCP) removes the temporal redundancy in video signals and reduces the size of bitstreams significantly. With MCP, the pixels to be coded are predicted from the temporally neighboring ones, and only the prediction errors and the motion vectors (MVs) are transmitted. However, due to the finite sampling rate, the actual position of the prediction in the neighboring frames may fall outside the sampling grid, where the intensity is unknown. The intensities of the positions between the integer pixels, called sub-positions, must therefore be interpolated, and the resolution of the MV is increased accordingly.

Single-pass switched interpolation filters with offsets (single-pass SIFO)

The 1/4th-pixel motion positions are shown in the figure.

Filter set 0: This filter set uses high-precision filtering with the 6-tap filters shown in Table 1, with the exception of position g, where a non-separable filter (not shown here) is used, followed by a right shift by 7 bits. The filter coefficients are scaled by 256 and the intermediate data is kept at higher precision.

Table 1: 6-tap filters
1/4  {  8, -32, 224,  72, -24,   8 }  (8 additions, 4 shifts)
1/2  {  8, -40, 160, 160, -40,   8 }  (6 additions, 3 shifts)
3/4  {  8, -24,  72, 224, -32,   8 }  (8 additions, 4 shifts)

Filter set 1: This filter set uses 12-tap filters for both horizontal and vertical filtering, shown in Table 2. The filter coefficients are scaled by 256; a higher bit depth of filter coefficients leads to more accurate interpolation (for comparison, H.264/AVC uses 5 bits for its filter coefficients). The intermediate values are also maintained at higher bit precision.

Table 2: 12-tap separable filters
1/4  { -1,  5, -12,  20, -40, 229,  76, -32,  16,  -8,  4, -1 }  (18 additions, 6 shifts)
1/2  { -1,  8, -16,  24, -48, 161, 161, -48,  24, -16,  8, -1 }  (15 additions, 4 shifts)
3/4  { -1,  4,  -8,  16, -32,  76, 229, -40,  20, -12,  5, -1 }  (18 additions, 6 shifts)
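To make the integer arithmetic concrete, the sketch below applies the half-pel filter from Table 1 to one position of a 1-D row of samples, normalizing by 256 with a rounding shift. It illustrates only the basic filtering step; the sample values and the 8-bit clipping range are assumptions, and it is not the reference software.

```c
/*
 * Illustrative sketch (not HM code) of sub-pel interpolation using the 6-tap
 * half-pel filter from Table 1, with coefficients scaled by 256.
 */
#include <stdio.h>

/* Half-pel (1/2) filter from Table 1, scaled by 256. */
static const int kHalfPel[6] = { 8, -40, 160, 160, -40, 8 };

/* Interpolate the half-pel sample between row[i] and row[i+1].
 * The caller must guarantee that indices i-2 .. i+3 are valid. */
static int interp_half_pel(const int *row, int i)
{
    long acc = 0;
    for (int k = 0; k < 6; k++)
        acc += (long)kHalfPel[k] * row[i - 2 + k];
    /* Normalize by 256 with rounding, then clip to an 8-bit sample range. */
    int val = (int)((acc + 128) >> 8);
    if (val < 0)   val = 0;
    if (val > 255) val = 255;
    return val;
}

int main(void)
{
    /* A toy row of integer-pel samples. */
    int row[8] = { 10, 20, 40, 80, 120, 160, 180, 190 };
    printf("half-pel between row[3] and row[4]: %d\n", interp_half_pel(row, 3));
    return 0;
}
```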

ADAPTIVE MOTION VECTOR RESOLUTION

For each region in a motion partition, the motion accuracy can be adaptively chosen to be 1/4th pixel or 1/8th pixel. We will refer to this as adaptive motion vector resolution. The choice of motion vector resolution is signalled to the decoder: for each motion vector, a motion vector resolution flag is encoded. If the flag is zero, the motion vector precision is 1/4th pixel; otherwise it is 1/8th pixel. If the flag is 1 and the motion vector is nonzero, refinement information is sent for this motion vector which specifies the 1/8th-pel precision.

The encoder always maintains the motion vector (MV) and MVD information at 1/8th-pixel resolution. The MV prediction for the current block is therefore formed with 1/8th-pixel accuracy. If the current block has only 1/4th-pixel motion accuracy, the MV prediction is converted to 1/4th-pixel accuracy; if the current block has 1/8th-pixel motion accuracy, the MVD is formed directly by subtracting the MV prediction from the motion vector of the current block.

Once the MVD is formed, if the current block has 1/4th-pixel accuracy, the MVDs of all the neighboring blocks used for determining the MVD contexts are converted to 1/4th-pixel accuracy. A similar procedure is followed for 1/8th-pixel accuracy.
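A rough sketch of the bookkeeping described above is given below: motion vectors are kept internally in 1/8-pel units, and the predictor is rounded to 1/4-pel before the MVD is formed when the block signals 1/4-pel accuracy. The data layout and the rounding rule are assumptions made for illustration, not the draft's normative procedure.

```c
/*
 * Sketch of adaptive motion vector resolution bookkeeping. The encoder keeps
 * MVs at 1/8-pel internally; when a block signals 1/4-pel accuracy, the
 * predictor is rounded to 1/4-pel before the difference is formed.
 */
#include <stdio.h>

typedef struct { int x, y; } MV;   /* components in 1/8-pel units */

/* Round a 1/8-pel value to the nearest 1/4-pel position (result stays in
 * 1/8-pel units, so it is always even). */
static int round_to_quarter(int v8)
{
    int sign = v8 < 0 ? -1 : 1;
    int mag  = v8 < 0 ? -v8 : v8;
    return sign * (((mag + 1) >> 1) << 1);
}

/* Form the MV difference to be coded for the current block. */
static MV form_mvd(MV mv, MV pred, int eighth_pel_flag)
{
    MV mvd;
    if (!eighth_pel_flag) {            /* block uses 1/4-pel accuracy */
        pred.x = round_to_quarter(pred.x);
        pred.y = round_to_quarter(pred.y);
    }
    mvd.x = mv.x - pred.x;
    mvd.y = mv.y - pred.y;
    return mvd;
}

int main(void)
{
    MV mv   = { 14, -6 };              /* 1/8-pel units */
    MV pred = {  9, -5 };
    MV q = form_mvd(mv, pred, 0);      /* 1/4-pel block */
    MV e = form_mvd(mv, pred, 1);      /* 1/8-pel block */
    printf("MVD (1/4-pel block): (%d,%d)\n", q.x, q.y);
    printf("MVD (1/8-pel block): (%d,%d)\n", e.x, e.y);
    return 0;
}
```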

INTRA-FRAME PREDICTION
For blocks of size 64x64: 33 directions (ADI + Planar)
For blocks of size 32x32: 33 directions (ADI + Planar)
For blocks of size 16x16: 33 directions (ADI + Planar)
For blocks of size 8x8: 33 directions (Angular + Planar)
For blocks of size 4x4: 9 directions (AVC)

PLANAR PREDICTION

The planar prediction is designed to be able to reconstruct smooth image segments in a visually pleasing way and provides maximal continuity of the image plane at the macroblock borders. It is able to follow gradual changes of the pixel values by signalling a planar gradient for each macroblock coded in this mode.

PLANAR PREDICTION (CONTINUED)


Planar prediction of an 8x8 (chrominance) block. Bottom-right sample is signalled in the bitstream, rightmost and bottom samples are interpolated linearly, and the middle samples are interpolated bi-linearly.
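The sketch below turns that description into code for an 8x8 block: the rightmost column and the bottom row are interpolated linearly from the signalled bottom-right sample and the reconstructed border, and the interior is interpolated bilinearly. The exact weights and rounding are assumptions for illustration; the draft's equations may differ in detail.

```c
/*
 * Minimal sketch of planar prediction for an 8x8 block, following the slide's
 * description. Not the draft's exact equations.
 */
#include <stdio.h>

#define N 8

void planar_predict(const int top[N],    /* reconstructed row above the block  */
                    const int left[N],   /* reconstructed column left of block */
                    int bottom_right,    /* sample signalled in the bitstream  */
                    int pred[N][N])
{
    int right[N], bottom[N];

    /* Linearly interpolate the rightmost column between top[N-1] and
     * bottom_right, and the bottom row between left[N-1] and bottom_right. */
    for (int i = 0; i < N; i++) {
        right[i]  = ((N - 1 - i) * top[N - 1]  + (i + 1) * bottom_right + N / 2) / N;
        bottom[i] = ((N - 1 - i) * left[N - 1] + (i + 1) * bottom_right + N / 2) / N;
    }

    /* Bilinear interpolation of the interior from the surrounding edges. */
    for (int y = 0; y < N; y++) {
        for (int x = 0; x < N; x++) {
            int horiz = (N - 1 - x) * left[y] + (x + 1) * right[y];
            int vert  = (N - 1 - y) * top[x]  + (y + 1) * bottom[x];
            pred[y][x] = (horiz + vert + N) / (2 * N);
        }
    }
}

int main(void)
{
    int top[N]  = { 100, 102, 104, 106, 108, 110, 112, 114 };
    int left[N] = { 100,  99,  98,  97,  96,  95,  94,  93 };
    int pred[N][N];
    planar_predict(top, left, 120, pred);
    printf("pred[0][0]=%d  pred[7][7]=%d\n", pred[0][0], pred[7][7]);
    return 0;
}
```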

ARBITRARY DIRECTIONAL INTRA (ADI)

Arbitrary Directional Intra (ADI) generates prediction pixels by directional extrapolation or calculation using the nearest boundary pixels of the already decoded area. In ADI, even boundary pixels from the lower-left region may be used as context pixels for prediction, as depicted below.

Example of context pixels for ADI

While the 9 prediction modes are defined separately as Vertical, Horizontal, DC, Diagonal Down-Left, Diagonal Down-Right, Vertical-Right, Horizontal-Down, Vertical-Left and Horizontal-Up in H.264/AVC, most prediction modes in ADI are defined by an integer pair (dx, dy). The (dx, dy) pair represents the direction that each mode uses for context-pixel extrapolation.
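As an illustration of the (dx, dy) mechanism, the sketch below extrapolates each predicted pixel from the reconstructed row above the block along an integer direction. It is a deliberate simplification (nearest-sample copy, top reference row only, dx >= 0 and dy < 0 assumed); the actual ADI mode definitions, boundary handling and interpolation are richer.

```c
/*
 * Rough sketch of directional extrapolation with an integer (dx, dy) pair,
 * in the spirit of ADI. Each target pixel follows the direction back to the
 * row of reconstructed pixels above the block and copies the nearest sample.
 * Assumes dx >= 0 and dy < 0; sub-sample interpolation is omitted.
 */
#include <stdio.h>

#define N 8                 /* block size (example) */
#define REF_LEN (3 * N)

/* ref[] holds reconstructed samples of the row above the block, indexed so
 * that ref[N] sits directly above pred[?][0]; the extra entries provide the
 * additional context pixels a direction may point to. */
void adi_predict_from_top(const int ref[REF_LEN], int dx, int dy, int pred[N][N])
{
    for (int y = 0; y < N; y++) {
        for (int x = 0; x < N; x++) {
            /* Step (y+1) rows up along (dx, dy); dy < 0 points to the top
             * reference row. Rounded to the nearest reference position. */
            int num = (y + 1) * dx;
            int den = -dy;
            int offset = (num + den / 2) / den;
            int idx = N + x + offset;
            if (idx < 0) idx = 0;
            if (idx >= REF_LEN) idx = REF_LEN - 1;
            pred[y][x] = ref[idx];
        }
    }
}

int main(void)
{
    int ref[REF_LEN], pred[N][N];
    for (int i = 0; i < REF_LEN; i++)
        ref[i] = 100 + i;                    /* simple ramp as reference samples */
    adi_predict_from_top(ref, 1, -2, pred);  /* a diagonal-ish mode, (dx,dy)=(1,-2) */
    printf("pred[0][0]=%d  pred[7][7]=%d\n", pred[0][0], pred[7][7]);
    return 0;
}
```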

SPATIAL TRANSFORMS
Large transform (16x16, 32x32, 64x64)
For smooth data, a large transform has several advantages, such as better energy compaction and reduced quantization error. By contrast, H.264/AVC supports only 4x4 and 8x8 transform sizes. Larger transform sizes can be chosen (or set) for each coding unit (blocks or partitions), and transform sizes larger than the prediction unit can also be supported. Three additional transform sizes are included: 16x16, 32x32 and 64x64. The transforms are based on Chen's fast DCT algorithm, as it reduces the implementation complexity thanks to its regular butterfly structure; moreover, it is readily extensible to larger transform sizes.

Figure 3 - Signal flow graph of Chen's fast 16-point DCT transform

LARGE TRANSFORM (CONTINUED)


Figure 3 shows the signal flow graph of Chen's fast factorization of a 16-point DCT. In this figure, multiplication constants are represented by sinusoidal functions of specific angles, which would require floating-point operations. To solve this problem, the factors are scaled and approximated with fixed-precision pre-defined values, which can be calculated by cost-effective shift operations. Here, the ak values are approximations of cos(k*pi/32) for k = 1, 2, ..., 15.
a1 = 63/64   a2 = 62/64   a3 = 61/64   a4 = 59/64   a5 = 56/64
a6 = 53/64   a7 = 49/64   a8 = 45/64   a9 = 40/64   a10 = 35/64
a11 = 30/64  a12 = 24/64  a13 = 18/64  a14 = 12/64  a15 = 6/64

Table 3: Approximated constants for the 16-point transform

In this way, complexity can be significantly reduced.
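The short program below simply checks how close the tabulated ak values are to cos(k*pi/32) and shows how one of them can be applied with an integer multiply and a 6-bit shift. It is a verification aid for the reader, not part of the transform design itself.

```c
/*
 * Check that the fixed-point constants ak (scaled by 64) from the table above
 * are close to the exact cosines, and apply one of them in integer arithmetic.
 */
#include <math.h>
#include <stdio.h>

#define PI 3.14159265358979323846

/* ak scaled by 64; index 0 is unused (cos(0) = 1 -> 64). */
static const int ak64[16] = { 64, 63, 62, 61, 59, 56, 53, 49,
                              45, 40, 35, 30, 24, 18, 12, 6 };

int main(void)
{
    for (int k = 1; k <= 15; k++) {
        double exact  = cos(k * PI / 32.0);
        double approx = ak64[k] / 64.0;
        printf("k=%2d  cos=%.4f  ak=%2d/64=%.4f  err=%+.4f\n",
               k, exact, ak64[k], approx, approx - exact);
    }

    /* Multiplying a sample by a4 = 59/64 in fixed point: (x * 59) >> 6. */
    int x = 200;
    printf("200 * a4 ~= %d\n", (x * ak64[4]) >> 6);
    return 0;
}
```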

QUANTIZATION

The basic principle for quantization and de-quantization of coefficients for large transforms is the same as that used in H.264/AVC, i.e. a scalar quantizer with a dead zone. Currently, most image and video coding systems and standards, such as MPEG-1/2 and H.264/AVC, use transform-based techniques followed by quantization and entropy coding. The key idea is that transforms de-correlate the signal and compact the energy of a block into a few coefficients, which still represent the signal rather accurately after quantization and de-quantization. Nevertheless, this quantization/de-quantization process needs to be carefully designed in order to obtain the best possible subjective and objective quality.

In the encoder of the H.264/AVC reference software, scalar dead-zone quantization is adopted. To further improve performance, two other adaptive quantization techniques are also introduced, both based on adjusting the size of the dead zone and controlling the rounding behavior.
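A generic dead-zone scalar quantizer of the kind referred to above can be sketched as follows. The step size and rounding offset are example values chosen for illustration, not the ones used by H.264/AVC or the HEVC test software.

```c
/*
 * Sketch of a dead-zone scalar quantizer (generic illustration, not the
 * H.264/AVC or HM code). A rounding offset f < 1/2 creates a "dead zone"
 * around zero that maps small coefficients to level 0.
 */
#include <math.h>
#include <stdio.h>

/* Quantize one transform coefficient with step size qstep and offset f. */
static int quantize(double coeff, double qstep, double f)
{
    int sign = coeff < 0 ? -1 : 1;
    return sign * (int)(fabs(coeff) / qstep + f);
}

/* De-quantize: map the level back to a reconstruction value. */
static double dequantize(int level, double qstep)
{
    return level * qstep;
}

int main(void)
{
    double coeffs[] = { 0.4, 2.0, 5.7, -3.1, 11.9 };
    double qstep = 4.0;
    double f = 1.0 / 6.0;   /* example rounding offset smaller than 1/2 */
    for (int i = 0; i < 5; i++) {
        int level = quantize(coeffs[i], qstep, f);
        printf("coeff %+5.1f -> level %+d -> recon %+5.1f\n",
               coeffs[i], level, dequantize(level, qstep));
    }
    return 0;
}
```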

IN-LOOP FILTERING

The purpose of the adaptive loop filtering (ALF) process is to further reduce the distortion between the original picture and the reconstructed picture caused by complex lossy coding. Filters minimizing the distortion for both luma and chroma components are calculated using the Wiener filter approach; the Wiener filter's purpose is to reduce the amount of noise present in a signal by comparison with an estimate of the desired noiseless signal. After the filters are applied to a frame or to coding units, the filter coefficients are explicitly sent in the bitstream. This adaptive loop filter increases the quality of the reconstructed picture as well as the quality of the reference picture used for coding the next picture.
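The sketch below shows the Wiener idea in its simplest 1-D form: accumulate the autocorrelation of the reconstruction and its cross-correlation with the original, then solve the normal equations for the filter coefficients. The signal, the noise model and the 3-tap causal filter are toy assumptions; ALF itself derives 2-D filters over a picture or coding units and quantizes the coefficients before sending them.

```c
/*
 * Toy illustration of the Wiener-filter idea behind ALF: derive a short FIR
 * filter minimizing the MSE between an "original" signal and its noisy
 * reconstruction, by solving the normal equations R * c = p, where R is the
 * autocorrelation of the reconstruction and p its cross-correlation with the
 * original. 1-D, 3 taps, plain Gaussian elimination; not HM code.
 */
#include <stdio.h>

#define LEN  64
#define TAPS 3

static void solve3(double A[TAPS][TAPS], double b[TAPS], double c[TAPS])
{
    /* Gaussian elimination without pivoting (fine for this well-behaved toy). */
    for (int i = 0; i < TAPS; i++) {
        for (int j = i + 1; j < TAPS; j++) {
            double m = A[j][i] / A[i][i];
            for (int k = i; k < TAPS; k++) A[j][k] -= m * A[i][k];
            b[j] -= m * b[i];
        }
    }
    for (int i = TAPS - 1; i >= 0; i--) {
        c[i] = b[i];
        for (int k = i + 1; k < TAPS; k++) c[i] -= A[i][k] * c[k];
        c[i] /= A[i][i];
    }
}

int main(void)
{
    double orig[LEN], rec[LEN];
    for (int n = 0; n < LEN; n++) {
        orig[n] = (n % 16) * 4.0;                /* a simple ramp pattern */
        rec[n]  = orig[n] + ((n * 37) % 7 - 3);  /* deterministic "noise" */
    }

    /* Accumulate autocorrelation R and cross-correlation p over the signal. */
    double R[TAPS][TAPS] = {{0}}, p[TAPS] = {0}, c[TAPS];
    for (int n = TAPS - 1; n < LEN; n++) {
        for (int i = 0; i < TAPS; i++) {
            p[i] += orig[n] * rec[n - i];
            for (int j = 0; j < TAPS; j++)
                R[i][j] += rec[n - i] * rec[n - j];
        }
    }
    solve3(R, p, c);

    printf("derived filter coefficients: %.3f %.3f %.3f\n", c[0], c[1], c[2]);
    /* In ALF, such coefficients would be quantized and sent in the bitstream. */
    return 0;
}
```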

LOW COMPLEXITY ENTROPY CODING

Encoding of a parameter/event using a VLC table is typically done in three steps:

1. Convert the parameter/event value to a table index by using some enumeration scheme.
2. Use the table index to generate a code number through lookup in a sorting table.
3. Use the code number to generate a binary codeword by lookup in the pre-determined VLC table.

Note: The purpose of the sorting table is to assign code numbers according to increasing probability, so that parameters/events with high probability are assigned a code number with a low value.

Example: encoding and decoding (next slides)

LOW COMPLEXITY ENTROPY CODING

Encoding of VLC

Decoding of VLC
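To make the three steps concrete, here is a minimal sketch with a made-up sorting table and codeword table; a real low-complexity entropy coder would use tables designed for, or adapted to, the actual symbol statistics.

```c
/*
 * Minimal sketch of the three-step, table-based VLC encoding described above.
 * The event values, sorting table and codeword table are invented examples.
 */
#include <stdio.h>

/* Step 2: sorting table mapping table index -> code number, so that more
 * probable events get smaller code numbers. */
static const int sorting_table[4] = { 2, 0, 1, 3 };

/* Step 3: VLC table mapping code number -> binary codeword (kept as strings
 * for readability); shorter codewords for smaller code numbers. */
static const char *vlc_table[4] = { "1", "01", "001", "0001" };

/* Step 1 (enumeration scheme): here the event value is used directly as the
 * table index, the simplest possible enumeration. */
static const char *encode_event(int event_value)
{
    int table_index = event_value;                 /* step 1 */
    int code_number = sorting_table[table_index];  /* step 2 */
    return vlc_table[code_number];                 /* step 3 */
}

int main(void)
{
    for (int v = 0; v < 4; v++)
        printf("event %d -> codeword %s\n", v, encode_event(v));
    return 0;
}
```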

ADVANTAGES OF H.265

HEVC is being developed using picture sizes ranging from WQVGA (432x240 16:9) to Ultra HDTV (7680x4320). 4K video is already being used for Digital Cinema, and many believe that Ultra HDTV resolution will become widely available by 2020.

HEVC will therefore significantly reduce bandwidth requirements for video conferencing and streaming, which can be used either to reduce bandwidth costs or to increase video resolution and quality.

ADVANTAGES OF H.265
The performance goal is that HEVC should provide 2x better video compression performance than H.264 high profile (that is, around half the bit rate for the same visual quality). By the time HEVC is available in products, those products will likely be running chipsets that are powerful enough to efficiently support the processing requirements.

ISSUES WITH H.265


- Not a technology that is currently available: a draft standard is expected by mid-2012 and ratification in 2013.
- Backward/forward compatibility is not assumed to be required for H.265, as H.265 is a brand new standard rather than an extension of H.264.
- The performance gain comes at a cost of computational complexity and requires more processing power.

IS H.265 PRACTICAL?
Points in deciding the practical usage of H.265:

- Requires high processing power vs. devices getting more powerful
- Will come to users in 2013 vs. will another codec take its place in the meantime
- Bandwidth vs. getting better quality over the same bandwidth thanks to the improved compression

H.265 RELEASE SCHEDULE


The timescale for completing the H.265 standard is as follows:

- February 2012: Committee Draft (complete draft of the standard)
- July 2012: Draft International Standard
- January 2013: Final Draft International Standard (ready to be ratified as a standard)

REFERENCES

en.wikipedia.org/wiki/High_Efficiency_Video_Coding
www.H265.net
www.vcodex.com/h265.html
