Image and Video Compression


IT T64 INFORMATION CODING TECHNIQUES - J. VEERENDESWARI
UNIT III
Image and video compression: Quantization-JPEG standards-motion
compensation-MPEG-1- MPEG-2-MPEG-4, H.26x standards.
In this unit we deal with compression techniques for image and video information. The unit is organized as follows:
1. Quantization (uniform & non-uniform)
2. JPEG standard (Joint Photographic Experts Group)
i. Features of JPEG

ii. Modes of operation

iii. Baseline mode & its block diagram

iv. Image & its block diagram

v. JPEG decoding block diagram

vi. JPEG hierarchy diagram

3. Motion compensation
 Need for compensation

 Compensation techniques

 Interpolative technique

 Predictive technique

 Transform coding technique

4. MPEG standard
i. MPEG1 standard

 Introduction

 Features

 Video format

 Data structure & compression modes

 Intra & interframe compression modes

 Encoder & decoder


ii. MPEG2 standard

 Introduction

 Macro block

 Interlaced video

 Scalable extension

 Other improvements

 Overview of profiles & levels

iii. MPEG4 standard

 Introduction

 MPEG4 system

 Major components & block diagram

 Logical structure of a scene

 Natural 2D motion video

 Block diagram of base coders

 Coding tools & algorithm

 Error resilience

 Synthetic images

Quantization:
Quantization is the process of representing the continuous-valued samples of a signal with a finite number of states, producing a discrete-valued signal (i.e. a continuous signal is converted to a discrete one).
If each sample is quantized independently, the process is called scalar quantization:
Q(s) = ri if s ∈ [di-1, di), i = 1, 2, …, L
where L is the number of output states, the di are the decision levels and the ri are the reconstruction levels.
Vector quantization, on the other hand, represents a set of vectors, each formed from continuous-valued samples, with a finite number of vector states.
Note:
In image compression, scalar or vector quantization is applied to the transform-domain representation of the image.
The performance of any quantizer is quantified by a distortion measure D, which is a function of the quantization error:
D = f(e), where e is the quantization error, i.e. the difference between the input sample and its quantized value, e = S - Q(S).
Generally there are two types of Quantization
1) Uniform Quantization

2) Non uniform Quantization

Non uniform Quantization:


The optimum non-uniform quantizer is determined by the Lloyd-Max conditions. The mean-square quantization error is
E{(S - ri)^2} = Σ(i=1..L) ∫(di-1..di) (s - ri)^2 p(s) ds
where p(s) is the probability density function of the input. Minimizing this with respect to ri and di gives
ri = [∫(di-1..di) s p(s) ds] / [∫(di-1..di) p(s) ds], 1 <= i <= L
di = (ri + ri+1)/2
These equations are non-linear, and their solution requires an iterative method such as the Newton-Raphson algorithm. The resulting decision and reconstruction levels are in general not equally spaced, hence the quantizer is non-uniform, except when S has a uniform probability density function over each interval [di-1, di].
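The Lloyd-Max conditions above have to be solved numerically. As an illustration only, here is a minimal Python sketch (assuming NumPy) that uses the simple fixed-point (Lloyd) iteration of the two conditions rather than the Newton-Raphson method mentioned above; the function name, grid size and Gaussian example source are assumptions, not part of any standard.

import numpy as np

def lloyd_max(pdf, a, b, L, iters=100):
    # Iteratively solve the Lloyd-Max conditions for an L-level quantizer.
    # pdf  : probability density function p(s) of the source
    # a, b : support of p(s) used for numerical integration
    # L    : number of reconstruction levels
    s = np.linspace(a, b, 10001)          # fine grid for numerical integration
    p = pdf(s)
    d = np.linspace(a, b, L + 1)          # initial decision levels (uniform)
    r = 0.5 * (d[:-1] + d[1:])            # initial reconstruction levels
    for _ in range(iters):
        # r_i = centroid of p(s) over [d_{i-1}, d_i]
        for i in range(L):
            mask = (s >= d[i]) & (s < d[i + 1])
            num = np.trapz(s[mask] * p[mask], s[mask])
            den = np.trapz(p[mask], s[mask])
            if den > 0:
                r[i] = num / den
        # d_i = midpoint between adjacent reconstruction levels
        d[1:-1] = 0.5 * (r[:-1] + r[1:])
    return d, r

# Example: 4-level quantizer for a unit-variance Gaussian source
gauss = lambda s: np.exp(-s**2 / 2) / np.sqrt(2 * np.pi)
decision, recon = lloyd_max(gauss, -5.0, 5.0, L=4)
print("decision levels:", np.round(decision, 3))
print("reconstruction levels:", np.round(recon, 3))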
Uniform Quantization:
 A uniform quantizer is a quantizer in which all reconstruction levels are equally spaced:
ri+1 - ri = θ, 1 <= i <= L-1
where θ is a constant called the step size. The Lloyd-Max quantizer becomes a uniform quantizer when p(s) is uniformly distributed over an interval [a, b]:
p(s) = 1/(b - a) for a < s < b, and 0 otherwise.
Designing a uniform quantizer is then a simple task:
θ = (b - a)/L
di = a + iθ, 0 <= i <= L
ri = di-1 + θ/2, 1 <= i <= L
A uniform quantizer can also be designed when p(s) has a Laplacian distribution.
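As an illustration of the design equations above, here is a minimal Python sketch (assuming NumPy and a uniformly distributed input on [a, b]); the function names are illustrative. The measured mean-square error should come out close to θ^2/12, the classical value for a uniform source noted in the next paragraph.

import numpy as np

def uniform_quantizer(a, b, L):
    # Design a uniform L-level quantizer for inputs in [a, b].
    theta = (b - a) / L                       # step size
    d = a + theta * np.arange(L + 1)          # decision levels d_0 .. d_L
    r = d[:-1] + theta / 2                    # reconstruction levels (interval midpoints)
    return d, r

def quantize(x, d, r):
    # Map each sample to the reconstruction level of the interval it falls in.
    idx = np.clip(np.searchsorted(d, x, side='right') - 1, 0, len(r) - 1)
    return r[idx]

d, r = uniform_quantizer(0.0, 1.0, L=8)
x = np.random.uniform(0.0, 1.0, 10000)
xq = quantize(x, d, r)
print("step size:", d[1] - d[0])
print("measured MSE:", np.mean((x - xq) ** 2))   # close to theta^2 / 12 for a uniform input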
To avoid confusion in the notation, the number of levels L may be taken as either even or odd; the L-even and L-odd quantizers are related symmetrically. The mean-square quantization noise can then be expressed in terms of the step size θ (for a uniformly distributed input it equals θ^2/12). The quantization characteristics for L-even and L-odd are staircase functions of step size θ, as shown in figure.
JPEG (Joint Photographic Experts Group):
JPEG modes of operation:
1. Sequential mode(Base Line)

2. Lossless mode

3. Progressive mode

4. Hierarchical mode

Steps involved in sequential mode


1. Image/Block preparation

2. DCT computation

3. Quantization

4. Entropy coding

 Vector coding

 Differential coding

 Run-length coding

 Huffman coding

 Frame building

 Decoding
Block diagram of JPEG Baseline mode

In the late 1970s and early 1980s various image compression schemes and algorithms were introduced. Two worldwide organizations, CCITT and ISO, worked actively to propose new algorithms for image compression; one of the resulting standards is the JPEG standard, which is a lossy compression scheme.
It has the following features
1. Resolution Independent

2. High precision

3. No absolute bit rate length

4. Luminance and chrominance image

5. Extensible

Modes of operations:
This standard defines the range of different compression modes. Each mode is intended
for use in a particular application domain.
The modes are Sequential mode, Lossless mode, Progressive mode, Hierarchical mode.
JPEG is not a complete architecture for image exchange; it defines the data stream of an encoded image, which a decoder then needs to decompress.

Note: what is JFIF?


JFIF (JPEG File Interchange Format) is a format that enables JPEG bit streams to be exchanged between a wide variety of platforms and applications.
JPEG baseline:
This is also called the sequential algorithm. It supports only 8-bit images and uses only Huffman coding for entropy compression.
There are five main stages associated with it:
1. Image/block preparation

2. DCT computation

3. Quantization

4. Entropy coding

a) Vector coding

b) Differential coding

c) Run-length coding

d) Huffman coding

5.Frame building
Image/block preparation:
 Source image as 2-d matrix of pixel values.

 R,G,B format requires three matrices one each for R,G,B Quantized values.

 In Y,U,V representation the U and V matrices can be half the size of the Y matrix.

 Source image matrix is divided into blocks of 8*8 sub matrices.

 Smaller block size helps DCT computation and individual blocks are equally fed
to the DCT which transforms each block separately.
The image preparation of 8*8 matrix will look like this

We have various image formats, such as monochrome, RGB and luminance/chrominance (YUV) formats. After one of these formats is selected, block preparation is carried out and the blocks are forwarded to the DCT (Discrete Cosine Transform).
As DCT computation is a time-consuming process, the total matrix is divided into a set of smaller 8*8 sub-matrices, as sketched below.
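A minimal Python sketch of the block-preparation step is given below (assuming NumPy and image dimensions that are exact multiples of 8; real encoders pad the edges). The function name is illustrative.

import numpy as np

def to_blocks(image, n=8):
    # Split a 2-D image array into n*n blocks.
    h, w = image.shape
    return (image.reshape(h // n, n, w // n, n)
                 .swapaxes(1, 2)
                 .reshape(-1, n, n))

# Example: a 16x16 luminance matrix becomes four 8x8 blocks
y = np.arange(16 * 16, dtype=np.uint8).reshape(16, 16)
blocks = to_blocks(y)
print(blocks.shape)   # (4, 8, 8)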

DCT computation:
The DCT is a mathematical transformation, related to the FFT (Fast Fourier Transform): it takes a signal as input and transforms it into another representation. It takes a set of points from the spatial domain and converts them into an equivalent representation in the frequency (spectral) domain.
The steps are:
 Each pixel value in the 2D matrix is quantized using 8 bits, which produces a value in the range 0 to 255 for the intensity/luminance values and -128 to 127 for the chrominance values. All values are shifted to the range -128 to 127 before computing the DCT.

 All 64 values in input matrix contribute to each entry in the transformed matrix.

 The other 63 values are called the AC coefficients and have a frequency coefficient
associated with them.
 The value in the location F[0,0] of the transformed matrix is called the DC coefficient
and is the average of all 64 values in the matrix.

 Spatial frequency coefficients increase as we move from left to right (horizontally) or from top to bottom (vertically). Low spatial frequencies are clustered in the top-left corner.

The formula for the DCT, which operates on an NxN square matrix of pixel values and yields an NxN matrix of spatial frequency coefficients, is (for the 8x8 blocks used in JPEG):
F[u,v] = (1/4) C(u) C(v) Σ(x=0..7) Σ(y=0..7) P[x,y] cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]
where C(0) = 1/√2 and C(k) = 1 for k > 0.
The DCT itself is essentially a lossless transformation and does not perform compression; its output can even occupy more space than the input. To obtain a compressed image we go on to quantization.
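The sketch below implements the 8x8 DCT formula above directly as a nested sum, purely for clarity; practical encoders use fast DCT algorithms. The level shift to -128..127 mentioned earlier is applied before the transform. (Assumes NumPy; the function name is illustrative.)

import numpy as np

def dct_2d(block):
    # Direct 8x8 forward DCT: F[u,v] = 1/4 C(u) C(v) sum_x sum_y P[x,y] cos(.) cos(.)
    N = 8
    C = lambda k: 1 / np.sqrt(2) if k == 0 else 1.0
    F = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x, y]
                          * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                          * np.cos((2 * y + 1) * v * np.pi / (2 * N)))
            F[u, v] = 0.25 * C(u) * C(v) * s
    return F

# Level-shift an 8x8 block of 8-bit samples to -128..127, then transform
block = np.full((8, 8), 130.0) - 128.0      # a flat (constant) block
F = dct_2d(block)
print(np.round(F, 2))                        # after rounding, only the DC coefficient F[0,0] is non-zero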

Quantization:
It consists of the following steps.
 The human eye responds to the DC coefficient and the lower spatial frequency
coefficients.

 If the magnitude of a higher frequency coefficient is below a certain threshold, the eye
will not detect it.

 Set the frequency coefficients in the transformed matrix whose amplitudes are less than a defined threshold to zero.

 During quantization the sizes of the DC & AC coefficients are reduced.

 A division operation is performed using the predefined threshold value as the divisor.

This quantization rounds each coefficient to the nearest integer value; it is performed using a quantization table (a small sketch follows the notes on the table below).
Quantization table:
 Threshold values vary for each of the 64 DCT coefficients and are held in a 2-D matrix.

 There is a trade-off between the level of compression required and the information loss that is acceptable.

 Two default quantization tables are defined, one for the luminance coefficients & the other for the chrominance coefficients; customized tables may also be used.
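A small sketch of the quantize/dequantize step is shown below (assuming NumPy). The table is the example luminance quantization table published in the JPEG specification; in practice the table actually used (possibly scaled for quality) is carried with the image.

import numpy as np

# Example luminance quantization table from the JPEG specification
Q_LUM = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99]])

def quantize_block(F, table=Q_LUM):
    # Divide each DCT coefficient by its threshold and round to the nearest integer.
    return np.rint(F / table).astype(int)

def dequantize_block(Fq, table=Q_LUM):
    # Decoder side: multiply back; the rounding error is the information lost.
    return Fq * table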

Entropy coding:
 Vectoring - the 2D matrix of quantized DCT coefficients is represented in the form of a single-dimensional vector.

 After quantization, most of the high-frequency coefficients are zero.

 To exploit the number of zeros, a zigzag scan of the matrix is used (see the sketch after this list).

 The zigzag scan allows the DC coefficient & the lower-frequency AC coefficients to be scanned first.
 DC coefficients are encoded using differential encoding & AC coefficients using run-length coding; Huffman coding is then used to encode both.
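A compact sketch of the zigzag scan (assuming NumPy; the function name is illustrative): it reads the 8x8 matrix along anti-diagonals so that the DC and low-frequency AC coefficients come first.

import numpy as np

def zigzag(block):
    # Order the n*n coefficients along anti-diagonals, alternating direction,
    # so that low spatial frequencies appear first in the output vector.
    n = block.shape[0]
    order = sorted(((x, y) for x in range(n) for y in range(n)),
                   key=lambda p: (p[0] + p[1],                          # which anti-diagonal
                                  p[0] if (p[0] + p[1]) % 2 else p[1])) # direction alternates
    return np.array([block[x, y] for x, y in order])

Fq = np.arange(64).reshape(8, 8)   # here each entry is just its own row-major index
print(zigzag(Fq)[:10])             # [ 0  1  8 16  9  2  3 10 17 24]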

Differential encoding:
 The DC coefficient is the largest value in the transformed matrix.

 The DC coefficient varies only slowly from one block to the next.

 Only the change in the DC coefficient value from one block to the next is encoded, not the absolute value (see the sketch after this list).

 The difference values are encoded in the form (SSS, value), where the SSS field indicates the number of bits needed to encode the value and the value field holds its binary form.
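A minimal sketch of the differential encoding of the DC coefficients is given below (plain Python; the helper names are illustrative). The exact bit-level representation of the value field used by baseline JPEG is simplified here to the (SSS, value) pair itself.

def dc_differences(dc_values):
    # Encode only the change in the DC coefficient from one block to the next.
    prev, diffs = 0, []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def size_category(value):
    # SSS field: number of bits needed to represent the difference value.
    return 0 if value == 0 else abs(value).bit_length()

dcs = [120, 118, 118, 121]                       # DC coefficients of successive blocks
diffs = dc_differences(dcs)                      # [120, -2, 0, 3]
print([(size_category(d), d) for d in diffs])    # [(7, 120), (2, -2), (0, 0), (2, 3)]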

Run - length encoding:


 The 63 remaining values of the block are the AC coefficients.

 They contain long strings of zeros because of the zigzag scan.

 Each AC coefficient is encoded as a pair (skip, value), where skip is the number of zeros in the run and value is the next non-zero coefficient (see the sketch after this list).
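A minimal sketch of the (skip, value) run-length encoding of the 63 AC coefficients (plain Python). The end-of-block marker is the conventional (0, 0) pair, and the special handling real codecs use for runs longer than 15 zeros is omitted.

def run_length_ac(ac_coeffs):
    # Encode the AC coefficients as (skip, value) pairs, where skip counts the
    # zeros preceding each non-zero coefficient; (0, 0) marks end of block.
    pairs, skip = [], 0
    for c in ac_coeffs:
        if c == 0:
            skip += 1
        else:
            pairs.append((skip, c))
            skip = 0
    pairs.append((0, 0))
    return pairs

ac = [5, 0, 0, -3, 0, 0, 0, 1] + [0] * 55   # 63 AC values after the zigzag scan
print(run_length_ac(ac))                    # [(0, 5), (2, -3), (3, 1), (0, 0)]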

Huffman encoding:
 Long strings of binary digits are replaced by shorter code words.

 The prefix property of the Huffman code words enables the encoded bit stream to be decoded unambiguously (a sketch of the code construction follows).
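For illustration, here is a compact sketch of the general Huffman code construction using Python's heapq. Note that baseline JPEG normally uses predefined Huffman tables for its (SSS) and (skip, value) symbols rather than building a code per image; this sketch only demonstrates how a prefix code assigns short code words to frequent symbols.

import heapq
from collections import Counter

def huffman_code(symbols):
    # Build a prefix code: frequent symbols get short code words, rare ones long ones.
    freq = Counter(symbols)
    if len(freq) == 1:                              # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Each heap entry: (frequency, tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)             # two least-frequent subtrees
        f2, i2, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, i2, merged))
    return heap[0][2]

print(huffman_code("aaaabbbccd"))   # e.g. {'a': '0', 'b': '10', 'd': '110', 'c': '111'}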

Frame building:
 Frame building encapsulates the information relating to an encoded image so that a remote computer can decode it. JPEG includes a definition of the overall bit stream, known as a frame, which is outlined in the diagram.

DECODING:
The JPEG decoder is made up of a number of stages, which are simply the inverses of the corresponding stages used in the encoder; hence the decoding function is the reverse of the encoding function.
JPEG PROGRESSIVE MODE:
JPEG HIERARCHICAL MODE: (REFER BOOK)
NOTES :
The remaining modes of JPEG standard
 Progressive mode

 Hierarchical mode ( pg:3.20-3.24)

 Lossless mode

MOTION COMPENSATION: (VIDEO STREAM INFORMATION)


Need for compression of video:
1. Real time and non-real time:
The main objective is to capture, compress, decompress and play back video in real time with no perceptible delay; the requirement is a sufficient frame rate so that there is no jerky motion.
Other considerations are symmetric versus asymmetric requirements, which are factors concerning the compression ratio.
Lossy versus lossless compression and inter-frame versus intra-frame coding (i.e. treating each picture discretely versus predicting it from other pictures) are all important factors for compressing video images.
Motion compensation is particularly important in cases such as a video sequence in which almost nothing is moving (e.g. a view of the sea). Each video sequence is divided into frames, and each frame may be nearly the same as the previous one; the sequence is perceived as real-time video through the reception of a continuous series of still pictures (frames).
When treating the moving picture as a sequence of still images, one needs to consider:
 COLOUR RESOLUTION:
This refers to the number of colours displayed at any one time and is also concerned with the colour format (RGB, YUV).

 SPATIAL RESOLUTION:
This deals with the size of the picture. Reducing the amount of data needed to reproduce video saves storage space, increases access speed and enables us to view the video digitally.

There are various motion compensation techniques available:

 Interpolative technique

 Predictive technique

 Transform coding technique

The interpolative technique aims to send only a subset of the pictures and to interpolate between them to reconstruct the missing information. It is particularly useful for motion sequences.

 The predictive techniques use differential PCM and adaptive PCM (ADPCM).

 Transform coding converts the data into an alternative form that is more convenient for some particular purpose; it follows the principle of a DCT followed by entropy coding.

An illustrative block-matching sketch of motion estimation is given below.
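To make motion estimation concrete, here is an illustrative full-search block-matching sketch (assuming NumPy); the function name, block size and search range are arbitrary choices, and practical encoders use much faster search strategies with sub-pixel refinement.

import numpy as np

def full_search(ref, cur, bx, by, n=16, search=8):
    # Find the displacement of the n*n block at (by, bx) of the current frame
    # that best matches the reference (previous) frame, using the sum of
    # absolute differences (SAD) as the matching criterion.
    block = cur[by:by + n, bx:bx + n].astype(int)
    best, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                continue                              # candidate falls outside the frame
            sad = np.abs(ref[y:y + n, x:x + n].astype(int) - block).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv, best

# Toy example: the current frame is the reference shifted down 2 and right 3 pixels
ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(np.roll(ref, 2, axis=0), 3, axis=1)
mv, sad = full_search(ref, cur, bx=16, by=16)
print("motion vector (dy, dx):", mv, "SAD:", sad)   # expect (-2, -3) with SAD 0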

There are various video compression standards that use motion compensation, termed:

 MPEG-1

 MPEG-2

 MPEG-4

MPEG activity started in 1988 and defined various algorithms and simulation models; the work was completed by 1990, and MPEG-1 was formally approved by ISO in late 1992.

MPEG1 FEATURES:
It is a generic standard: it standardizes the syntax for the representation of the encoded bit stream and the method of decoding. The standard supports:
1. Motion estimation
2. Motion compensation and prediction

3. DCT

4. Quantization

5. Any kind of coding

The standard does not prescribe a particular encoding procedure for generating the data stream; the encoder has flexibility in how the data stream is produced.

SPECIFIC APPLICATION AREAS:
 Provide random access to video

 Fast forward and reverse operation available

 Reasonable coding and decoding with minimum delay

Input video format of MPEG1:


MPEG-1 handles progressive video (i.e. it is not interlaced video). It operates at a bit rate of about 1.5 Mbps.

The input video is first converted into a format specified by the MPEG-1 standard, called SIF (Source Input Format), in which the luminance channel is 352 pixels by 240 lines at 30 frames per second. The constrained parameters for implementations of the MPEG-1 standard are:

Horizontal picture size <= 768 pixels

Vertical picture size <= 576 lines

No. of macro blocks <= 396

No. of macro blocks * picture rate <= 396*25 = 9900

Picture rate <= 30 pictures/s

VBV buffer size >= 2,621,440 bits

Bit rate <= 1,856,000 bits/s

DATA STRUCTURES AND COMPRESSION MODES:


It follows a hierarchical data structure consisting of 6 layers (a small sketch of the picture ordering follows this list):
(i) A sequence is formed by several groups of pictures.
(ii) A group of pictures (GOP) is made up of pictures.

(iii) A picture consists of slices; generally there are 4 picture types:

I-picture, B-picture, P-picture, D-picture

(iv) Slices consist of macro blocks.

(v) Macro blocks are composed of blocks.

(vi) Blocks are 8*8 pixel arrays.
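A small sketch of the picture-type idea, assuming a typical I/B/P pattern: it shows why the transmission (coding) order of a group of pictures differs from the display order when B-pictures are used, since a B-picture needs both its past and its future reference before it can be decoded. The pattern chosen and the helper name are illustrative only.

# Display order of one GOP, e.g. IBBPBBPBBP (I = intra, P = predicted, B = bi-directional)
display_order = list("IBBPBBPBBP")

def coding_order(gop):
    # Reorder so that every B picture follows the two reference pictures it depends on.
    out, pending_b = [], []
    for pic in gop:
        if pic == "B":
            pending_b.append(pic)      # hold B pictures until the next reference arrives
        else:                          # I or P picture: emit it, then the held B pictures
            out.append(pic)
            out.extend(pending_b)
            pending_b = []
    return out + pending_b

print("display order:", "".join(display_order))                 # IBBPBBPBBP
print("coding order :", "".join(coding_order(display_order)))   # IPBBPBBPBB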

P-FRAME:

If the block can be skipped, we just send a "skip" code. Otherwise, we compare the total number of bits for inter and intra coding and choose the more efficient one.

B-FRAME:

Comparison of three methods of encoding:

MPEG uses two compression modes: intra-frame compression and inter-frame compression.

In intra-frame mode we have the process of quantization and run-length coding.

In inter-frame mode a prediction model is used together with a DCT encoding approach.

Forward prediction gives a P-picture and bi-directional prediction gives a B-picture. Both predictions operate on macro blocks. The inter-frame mode then applies quantization and coding.
MPEG encoder:

In summary, the MPEG-1 encoder performs the following steps:

(1) Decide the labels (I, P, B) of the pictures in a GOP.

(2) Estimate the motion vector for each macro block in the P and B pictures.

(3) Select the compression mode for each macro block (inter or intra).

(4) Set the quantization scale and apply the quantization and coding algorithm.

NOTE:

Intra frame compression:

 Correlation/compression within a frame.

 Based on “baseline” JPEG compression standard.

Inter frame compression:

 Correlation/compression between like frames.

 Based on the H.261 compression standard.


MPEG 2 STANDARD:

The quality of MPEG-1 compressed video at 1.5 Mbps was found to be unacceptable for most entertainment-based applications. MPEG-2 was therefore introduced as a compatible extension of MPEG-1 to serve a wide range of applications at various bit rates from 2 to 20 Mbps. It allows interlaced input, high-definition input and sub-sampled input, and it offers a scalable bit stream. It also provides improved quantization and coding.
