l3 Mpeg


Video - Basics September, 2000

Multimedia Systems - Image & Video Capture

Joemon Jose
www.dcs.gla.ac.uk/~jj/teaching/demms4/
Tuesday, 15th January 2008

„ An image is captured when a camera scans a scene
‹ Colour => Red (R), Green (G) and Blue (B) array of digital samples
‹ Density of samples (pixels) gives resolution
„ A video is captured when a camera scans a scene at multiple time instants
„ Each sample is called a frame, giving rise to a frame rate (frames/sec) measured in Hz
‹ TV (full motion video) is 25 Hz
‹ Mobile video telephony is 8-15 Hz … jerky

15/01/2008 video 2

Image Capture

„ Colour still image:
‹ 420 x 315 pixels, 8 bits per colour channel (24 bits/pixel) = 387 KB

Image Data (RGB)

Red Green Blue
8 bits: 0-255
Example pixel: (R,G,B) = (204,153,205)


Video Technology: generating a colour

[Figure: frame buffer (2D array of 24 bit values) drives the red, green and blue colour guns, which excite the phosphor dots on the display. Example RGB value <128, 128, 255>, 8 bits per colour.]

Human Visual Perception

„ By mixing three primary colours in varying proportions, the perception of different colours can be created
„ The human eye is built up of
‹ Cones to perceive colour
‹ By exciting the retina using different intensities of the three primary colours, the same colour may be perceived by the brain even if its unique wavelength is not present.

Human Information

„ Identical colour combinations can cause different colour sensations under different conditions
„ Likewise, two different colours can be perceived as identical …
„ the human eye & brain
‹ Interpolation
‹ Pictures and events that can still be identified as separate
‹ Colour interactions in the brain are not at all straightforward
„ Adaptation
‹ General-brightness adaptation
‹ Lateral adaptation
‹ Chromatic adaptation

Colour

„ Colour is a visual feature which is immediately perceived
„ Salient chromatic properties are captured
„ Colour can add great value to an image
„ Presence and distributions of colours induce sensations and convey meanings in the observer according to specific rules
„ Representing colour on digital images and reproducing it accurately on output devices
„ Distances in colour space should correspond to human perceptual distance

Colour Space

„ To deal with colour we need to quantify it in some way
‹ gives us the notion of colour space or domain
„ Hierarchy of colour sets
‹ Perceivable by human beings
‹ Displayed on a monitor screen
‹ Calculated and stored in a frame memory

Representation of Colour Stimuli

„ Points in three dimensional space
„ Colorimetric models
‹ CIE Chromaticity diagram
„ Physiologically inspired
„ Psychological models
‹ HSV
„ Hardware-oriented models
‹ RGB, CMY, YIQ
„ User-oriented models
‹ HLS, HSV, HSB

Video Technology: representing colour

„ monochrome
‹ bilevel
) one bit/pixel: 0 = black, 1 = white
‹ grey-scale
) e.g., 8 bits/pixel = 256 intensities
„ colour
‹ value for each colour gun
‹ no of bits gives colour range
) e.g., 24 bits = 8 bits for red, 8 bits for green, 8 bits for blue
) colour depth

Video Technology: Colour Models: RGB

„ RGB = Red Green Blue
„ directly modelled in device (i.e., corresponds to colour guns in display)
„ easy to implement
„ not based on visual (perceived) colours
„ not perceptually uniform

Video Technology: Colour Models: RGB Colour Space

[Figure: RGB colour cube. Labelled vertices: Black (0,0,0), Red (1,0,0), Yellow (1,1,0), Blue (0,0,1), Cyan (0,1,1), plus Green, Magenta and White.]

Video Technology: Colour Models: RGB

„ Colour is labelled as the relative weights of three primary colours, in an additive system using the primaries Red, Green, Blue
„ It is a perceptually non-linear space
‹ Equal distances in the space do not necessarily correspond to perceptually equal sensation
„ Non-linear relationship between RGB values & the intensity produced in each phosphor dot; low intensity values produce small changes in response on the screen
„ It is not a good colour description system

Video Technology: Colour Models: HSV

„ HSV = hue, saturation, value (intensity)
„ “painter’s model”
„ better model for representing colours as we see them (“I want a bright highly saturated apple green.”)
„ can be converted to/from RGB
„ like RGB, axes not perceptually uniform
„ variant: HLS (hue, lightness, saturation)
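The conversion between RGB and HSV can be tried out with a short sketch. It assumes Python's standard colorsys module (the slides do not name a tool); the helper names rgb8_to_hsv and hsv_to_rgb8 are made up for illustration, and hue is reported in degrees.

```python
import colorsys

def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB (0-255) to (hue in degrees, saturation, value)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

def hsv_to_rgb8(h_deg, s, v):
    """Convert (hue in degrees, saturation, value) back to 8-bit RGB."""
    r, g, b = colorsys.hsv_to_rgb(h_deg / 360.0, s, v)
    return round(r * 255), round(g * 255), round(b * 255)

# Pure red sits at hue 0 with full saturation and full value.
pure_red = rgb8_to_hsv(255, 0, 0)

# A round trip through HSV should return the original 8-bit pixel.
round_trip = hsv_to_rgb8(*rgb8_to_hsv(204, 153, 205))
```

Lowering s moves a colour towards white and lowering v moves it towards black, which is the "painter's model" reading of the three axes.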

Video Technology: Colour Models: HSV Colour Space

[Figure: HSV colour space with hues Red, Yellow, Green, Cyan and Blue arranged around the hue axis; h = hue, s = saturation.]

Video Technology: Colour Models: HSV

„ Non-linear transformation of RGB cube
„ Hue: the quality by which we distinguish one colour family from others
„ Chroma: the quality by which we distinguish a strong colour from a weak one
„ Value: the quality by which we distinguish a light colour from a dark one
„ Selecting H corresponds to selecting a colour; selecting S corresponds to selecting the amount of white; selecting V corresponds to adding black
„ Perceptually non-linear
‹ Perceptual in the sense that we are using attributes that we normally think of
‹ Attributes are not independent
„ variant: HLS (hue, lightness, saturation)

Video Technology: Colour Models: YUV

„ colour model used for TV signal transmission
„ Y represents luminance (intensity of monochrome signal)
„ U,V carry separate colour information (colour difference values)
„ Y = 0.2125R + 0.7154G + 0.0721B
„ U = B-Y, V = R-Y
„ typically, Y contributes most to signal bandwidth
„ See:
‹ [A.K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, 1988]

Image Data (YUV)

[Figure: an RGB image split into Y (luminance), U (col. diff.) and V (col. diff.) planes; example value Y=127.]
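The YUV definitions above can be checked numerically. The sketch below uses the luminance weights and the unscaled colour differences exactly as given on the slide; the function names are illustrative, and broadcast systems additionally scale U and V, which is not modelled here.

```python
def rgb_to_yuv(r, g, b):
    """Luminance plus two colour-difference values, using the slide's weights."""
    y = 0.2125 * r + 0.7154 * g + 0.0721 * b
    u = b - y  # blue colour difference
    v = r - y  # red colour difference
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Invert the transform: recover B and R first, then solve the Y equation for G."""
    b = u + y
    r = v + y
    g = (y - 0.2125 * r - 0.0721 * b) / 0.7154
    return r, g, b

# A neutral grey has no colour difference: U and V are (numerically) zero.
grey = rgb_to_yuv(127, 127, 127)

# Round trip for an arbitrary pixel.
restored = yuv_to_rgb(*rgb_to_yuv(204, 153, 205))
```

Because the three weights sum to 1, any grey pixel maps to Y equal to its grey level with U = V = 0, which is why a monochrome receiver can use Y alone.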

Video Technology: CIE Colour Specification System

„ Commission Internationale d’Éclairage
„ colour labelling system
„ “XYZ” space
„ international standard (1931)
„ based on colour matching functions determined by experiments with human subjects
„ gives uniform colour spaces
„ needs transformation into one of the other models

Video Technology: Colour Models: CMYK

„ CMYK = cyan, magenta, yellow, black
„ “printer’s model”
„ a subtractive model
„ set of practically available CMYK colours (“process colours”) is not equivalent to the RGB set

Image & Video Capture

[Figure: a video as Red, Green and Blue planes (8 bits: 0-255), or as Y (luminance) and V planes with values from 0 (black) to 255 (white), sampled along the time axis at t1 (sec), t2 (sec), …, tN (sec).]

Video Sequence

„ Consists of a number of frames
‹ Images produced by digitising the time-varying signal generated by the sensors in a camera
‹ Bit-mapped images
„ Camera
‹ Circuitry inside a camera
‹ Purely digital signal (data stream) is fed into a computer via a high speed interface
) IEEE 1394 (FireWire)
„ Computer
‹ Broadcast video is fed into a video capture card attached to the computer
‹ Video capture card: analogue signal is converted into a digital form

Video Data

„ Desktop PC
‹ CIF (352 x 288), 8 bits per colour channel (24 bpp), 30 Hz = 8.7 MB/sec
‹ 30 sec clip = 261 MB
„ Video to mobile device
‹ QCIF (176 x 144), 8 bits per colour channel (24 bpp), 30 Hz = 2.2 MB/sec
‹ 30 sec clip = 65 MB
„ High Definition TV
‹ 1280 x 720, 24 bpp, 50 Hz = 0.13 GB/sec
‹ 2.5 hour movie = 1.1 TB

Pushing the hardware

„ Consumers' expectations are based on broadcast television
„ Consumer equipment plays back at a reduced frame rate, resulting in jittery playback (dropped frames)
„ In order to accommodate low-end PCs, considerable compromises over quality must be made
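The raw data rates can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes 24 bits per pixel (8 bits per colour channel) and binary megabytes (2^20 bytes); raw_rate_bytes is an illustrative name, not something from the slides.

```python
def raw_rate_bytes(width, height, bits_per_pixel, fps):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bits_per_pixel * fps / 8

MIB = 2 ** 20  # binary megabyte, assumed to be the unit behind the MB/sec figures

cif_mb_per_sec = raw_rate_bytes(352, 288, 24, 30) / MIB    # roughly 8.7
qcif_mb_per_sec = raw_rate_bytes(176, 144, 24, 30) / MIB   # roughly 2.2
cif_clip_mb = cif_mb_per_sec * 30                          # 30 second clip

# 1280 x 720 at 24 bpp and 50 Hz, in bytes per second.
hd_bytes_per_sec = raw_rate_bytes(1280, 720, 24, 50)
```

Under these assumptions the 720p case comes to 138,240,000 bytes per second, so a 2.5 hour programme needs a little over a terabyte before any compression.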

Persistence of vision

„ If a sequence of still images is presented to our eyes at a sufficiently high rate (frame rate ~40 fps), we experience a continuous visual sensation rather than perceiving individual images
‹ A lag in the eye’s response to visual stimuli which results in after images
„ If the consecutive images only differ by a small amount, any changes from one to the next will be perceived as movement of elements within the images
„ Film projector displays an image twice (24 fps becomes 48 fps)

Human Perception

„ What frame rate is perceived as smooth?
‹ No identification of single frames if refresh frequency is high enough
‹ Perception of 16 frames/s as continuous sequence
‹ Depends on material
„ More sensitive to low frequencies
„ More sensitive to changes in luminance and the blue-orange axis
„ Vision emphasizes edge detection

Digitization: camera vs computer

„ Advantage
‹ Analogue signal transmitted on a cable gets corrupted by noise
‹ Noise will creep in if analogue data is stored on a magnetic tape
‹ Camera is resistant to corruption by noise and …
„ Disadvantage
‹ User has no control over digitization
‹ Most conform to an appropriate standard

Image & Video Processing

„ When processing image/video data we have two choices:
‹ Raw data … termed uncompressed domain
) Direct processing of the pixel values on either a global or local basis
) Slow - more data, may require decode process
) Possible to extract a wide range of expressive information from raw data
‹ Encoded data … termed compressed domain
) Parse bitstream and process data contained therein
) Fast - partial image reconstruction, real-time possible
) Restricted to image/video data in bitstream
) Compression is about throwing away information for efficient representation and transmission

Video Bit Rate Calculation

bits/sec = (width * height * depth * fps) / compression factor

„ width ~ pixels (160, 320, 640, 720, 1280, 1920, …)
„ height ~ pixels (120, 240, 480, 485, 720, 1080, …)
„ depth ~ bits (1, 4, 8, 15, 16, 24, …)
„ fps ~ frames per second (5, 15, 20, 24, 30, …)
„ compression factor (1, 6, 24, …)

Examples

Width  Height  Depth  fps  Comp  Kb/sec   Notes
160    120     8      15   25    92       Basic Rate ISDN
160    120     16     20   20    307
320    240     8      15   25    369
320    240     16     24   24    1,229    MPEG1 (Primary Rate ISDN)
640    480     16     30   24    6,144    MPEG2
640    480     24     30   6     36,864   MJPEG
640    480     24     30   1     221,184  Uncompressed
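The bit rate formula is easy to check against the example rows. The sketch below is one way to do it; video_bit_rate is an illustrative name, and Kb/sec is taken to mean 1000 bits per second.

```python
def video_bit_rate(width, height, depth, fps, compression=1):
    """Bits per second for the given frame size, bit depth, frame rate and compression factor."""
    return width * height * depth * fps / compression

# (width, height, depth, fps, compression factor), one tuple per example row
rows = [
    (160, 120, 8, 15, 25),
    (160, 120, 16, 20, 20),
    (320, 240, 8, 15, 25),
    (320, 240, 16, 24, 24),
    (640, 480, 16, 30, 24),
    (640, 480, 24, 30, 6),
    (640, 480, 24, 30, 1),
]

kbits_per_sec = [round(video_bit_rate(*row) / 1000) for row in rows]
```

Each entry of kbits_per_sec corresponds to one row of the examples, in the same order.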

Video Data Size

size of uncompressed video in gigabytes

            1920x1080    1280x720     640x480     320x240     160x120
1 sec       0.19         0.08         0.03        0.01        0.00
1 min       11.20        4.98         1.66        0.41        0.10
1 hour      671.85       298.60       99.53       24.88       6.22
1000 hours  671,846.40   298,598.40   99,532.80   24,883.20   6,220.80

[Figure: image size of video, comparing frames of 1280x720 (aspect ratio 1.77), 640x480 (aspect ratio 1.33), 320x240 and 160x120.]

Compression

„ When captured, audio/video data is referred to as “raw” or “uncompressed”
„ In practice, it undergoes a software/hardware process to compact the data:
‹ Termed “compression” or “encoding”
‹ Results in an efficient bitstream that can be stored or transmitted
„ Requires a (less) complex process to uncompress (decode) before it can be displayed
„ A system for encoding & decoding is termed a “codec”

Image Compression Image Compression

„ Use frequency domain analysis
‹ The discrete cosine transform (DCT)
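
To make the frequency-domain idea concrete, here is a minimal pure-Python sketch of an orthonormal 2-D DCT (type II) on an 8x8 block. The 8x8 block size and the function names are assumptions for illustration; real codecs use fast, fixed-point implementations.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i in range(n))
        out.append(scale * total)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to [vertical][horizontal]

# A perfectly flat block has all of its energy in the single DC coefficient.
flat = [[100] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

Smooth image regions behave much like the flat block: most of the energy ends up in a few low-frequency coefficients, which is what makes the transformed data cheaper to code.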



Effects of Compression

storage for 1 hour of compressed video in megabytes
(3 bytes/pixel, 30 frames/sec)

        1920x1080  1280x720  640x480  320x240  160x120
1:1     671,846    298,598   99,533   24,883   6,221
3:1     223,949    99,533    33,178   8,294    2,074
6:1     111,974    49,766    16,589   4,147    1,037
25:1    26,874     11,944    3,981    995      249
100:1   6,718      2,986     995      249      62

Compression

„ Two types:
‹ Lossless: doesn’t change data, “simply” reorganizes it
• Used in medical applications (e.g. X-Rays) and document scanning (e.g. FAX)
‹ Lossy: throws some data away during encoding
• Used in most multimedia applications
„ Popular image/video compression standards for multimedia applications:
‹ JPEG (still images)
‹ JPEG 2000 (enhanced functionality/quality)
‹ MPEG-1 (video from CD-ROM)
‹ MPEG-2 (Digital TV, DVD)
‹ MPEG-4 (mobile and content-based functionality)
‹ Also: ITU-T real-time telecommunications standards e.g. H.261, H.263, H.264/MPEG-4 AVC
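The difference between the two types of compression can be shown in miniature. In this sketch zlib (from Python's standard library) stands in for a lossless coder, and discarding the low four bits of each sample stands in for a lossy step. Both are stand-ins chosen for illustration, not the methods used by JPEG or MPEG, and the sample data is invented.

```python
import zlib

# Hypothetical 8-bit samples: a dark run followed by a bright run.
samples = bytes([10] * 500 + [200] * 500)

# Lossless: the data is reorganised, and decoding returns it bit for bit.
packed = zlib.compress(samples)
restored = zlib.decompress(packed)

# Lossy: information is thrown away (here the 4 least significant bits)
# and cannot be recovered afterwards.
quantised = bytes(value & 0xF0 for value in samples)
max_error = max(abs(a - b) for a, b in zip(samples, quantised))
```

Here restored equals samples exactly, while quantised differs from samples by up to max_error per value; lossy coders accept that kind of error in exchange for fewer bits.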

Video codecs

„ Video capture boards
‹ Digitization and compression
‹ Decompression and digital to analogue transformation
‹ Devices: compressor/decompressor (codecs)
„ Hardware codecs
‹ Store them on a computer
‹ Then play them back to an external video monitor (TV set) attached to the VCC
‹ Most hardware codecs can not provide full motion video to monitor
‹ We can not know our audience will have any hardware codec available
„ Software codec
‹ Program that performs the same operation

What is MPEG

„ MPEG: Moving Picture Experts Group (created in 1988)
„ ISO (Int. Standards Organization) / IEC (Int. Electro-technical Commission)
‹ ISO/IEC JTC 1 / SC 29 / WG 11
„ Develop standards for the coded representation of moving pictures and associated audio

Video Technology:

„ motion JPEG
„ just applies JPEG to each frame
‹ YCBCR
‹ apply to each channel
„ used for compression during video capture
„ compression ratios of 7:1
„ no temporal compression
„ Allows users to set quality parameters
„ not a standard

Vector Quantization

„ Iterative algorithm
‹ Pick set of reference blocks (code book)
‹ Code picture blocks by code book entries
‹ Entropy/RLE code the code symbols
„ How to select code book
‹ Step 1: pick reference blocks
‹ Step 2: compare reconstructed image to original
‹ Step 3: add additional reference blocks
„ Slow encode, fast decode
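The three code book steps can be sketched as below. This is only one possible reading of the slide: blocks are flattened tuples of pixel values, similarity is squared error, and the code book grows by adding the worst-reconstructed block until every block is within max_error. All names and the toy data are assumptions, and the entropy/RLE stage is left out.

```python
def distance(a, b):
    """Squared error between two blocks of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(blocks, codebook):
    """Replace each block by the index of its nearest code book entry."""
    return [min(range(len(codebook)), key=lambda i: distance(block, codebook[i]))
            for block in blocks]

def decode(indices, codebook):
    """Decoding is a table lookup, hence fast."""
    return [codebook[i] for i in indices]

def build_codebook(blocks, max_error):
    codebook = [blocks[0]]                                   # Step 1: pick reference blocks
    while True:
        recon = decode(encode(blocks, codebook), codebook)   # Step 2: compare to original
        errors = [distance(b, r) for b, r in zip(blocks, recon)]
        worst = max(range(len(blocks)), key=errors.__getitem__)
        if errors[worst] <= max_error:
            return codebook
        codebook.append(blocks[worst])                       # Step 3: add a reference block

# Toy picture: two dark blocks and two bright blocks, 4 pixels each.
blocks = [(0, 0, 0, 0), (1, 0, 0, 1), (200, 200, 200, 200), (199, 201, 200, 200)]
codebook = build_codebook(blocks, max_error=4)
indices = encode(blocks, codebook)
```

The repeated encode and compare passes in build_codebook are what make encoding slow, while decode never does more than an index lookup.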

MPEG Standards

„ MPEG-1: Storage of moving picture and audio on storage media (CD-ROM), 11 / 1992
‹ aimed at low bit-rates of 1.5 Mb/s
‹ typical of CD-ROM
„ MPEG-2: Digital television, 11 / 1994
‹ aimed at bit rates of 8-15 Mb/s
„ MPEG-4: Coding of natural and synthetic media objects for multimedia applications, v1: 09 / 1998, v2: 11 / 1999
‹ introduction of objects into the specification
‹ wide range of data rates
‹ important for multimedia
„ MPEG-7: Multimedia content description for AV material, 08 / 2001

Video Technology: MPEG-1 compression approach

„ Spatial compression for individual frames
„ based on JPEG-like technique
„ temporal compression of sequences of frames
‹ looks for areas of change
‹ creates difference frames
‹ based on 16x16 macroblocks

Temporal Compression

„ Make use of similarities of frames
‹ Only difference between frames is encoded
‹ Process often termed motion compensation

[Figure: two consecutive frames, S1 and S2.]
The second one (S2) can be approximated by pieces of the first one
S1 acts as a reference frame

Motion Vectors

„ Algorithm searches for best matching block
„ Needs to calculate error term (matching block)
„ Needs to capture/convey spatial translation
‹ Motion vector
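A minimal sketch of the block search follows. It assumes an exhaustive search over a small window, the sum of absolute differences (SAD) as the error term, and a motion vector defined as the offset from the current block to its best match in the reference frame. None of these choices is specified on the slides, and the 2x2 block on an 8x8 frame is used only to keep the example small (MPEG-1 works on 16x16 macroblocks).

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a reference block and a current block."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))

def best_match(ref, cur, cx, cy, size, search):
    """Return (error, dx, dy) for the best matching block within +/- search pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                err = sad(ref, cur, rx, ry, cx, cy, size)
                if best is None or err < best[0]:
                    best = (err, dx, dy)
    return best

def frame_with_patch(x, y):
    """8x8 black frame with a bright 2x2 patch whose top-left corner is (x, y)."""
    frame = [[0] * 8 for _ in range(8)]
    for j in range(2):
        for i in range(2):
            frame[y + j][x + i] = 255
    return frame

s1 = frame_with_patch(1, 1)   # reference frame
s2 = frame_with_patch(3, 2)   # same patch, moved 2 right and 1 down
error, dx, dy = best_match(s1, s2, cx=3, cy=2, size=2, search=2)
```

With a perfect match the error term is zero and only the motion vector (dx, dy) has to be sent; otherwise the residual difference is coded along with the vector.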

Predicted Frames

„ Consider S3
‹ Has macroblocks in common with S1
‹ Could be reconstructed from S1
‹ S3 would then be a Predicted (P) frame

Bidirectional frames

„ Consider S2
‹ Has macroblocks in common with S1 and S3
‹ Could be constructed using pieces of S1 and S3
‹ S2 would then be a Bidirectional (B) frame
Both S1 and S3 act as reference frames


Question?

„ How can we know at the time S2 is coded that there will be a matching block in S3?
„ Answer:
‹ S3 needs to be available for reference at the time S2 is coded
‹ i.e., S1, S2, S3 would need to be buffered
‹ S2 is only sent (transmission order) once it has been interpolated from S1 and S3

Summary (from example)

„ S1 is an I frame: it is encoded without reference to any other frame
„ S3 is a P frame: it is predicted from a reference frame, in this case S1
„ S2 is a B frame: it is interpolated from S1 and S3
„ Display Order

Bitstream order

„ What about decoder …
„ How to handle B frames
‹ Needs info from later I or P frames in order to construct B frame
„ Display Order
‹ Bitstream order
‹ Solution: reorder the sequence
‹ Display order -> bitstream order: IBP to IPB

GOPS…

„ Encoders typically use a repeating sequence of I, P and B
„ This is known as a GOP (Group of pictures)
‹ Always begins with an I frame
‹ Common sequence (display order)
) N=9
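The reordering from display order to bitstream order can be sketched as follows. Frames are represented only by their type letter, and each B frame is held back until the next I or P frame (its later reference) has been emitted. Trailing B frames with no later reference are simply appended, which is a simplification made for this example.

```python
def display_to_bitstream(display_order):
    """Reorder frame types so every B frame follows both of its reference frames."""
    out, held_b = [], []
    for frame in display_order:
        if frame == "B":
            held_b.append(frame)      # wait for the next reference frame
        else:
            out.append(frame)         # I or P: send it first ...
            out.extend(held_b)        # ... then the B frames that depend on it
            held_b = []
    return "".join(out + held_b)

short = display_to_bitstream("IBP")
longer = display_to_bitstream("IBBPBBPBBI")
```

The decoder performs the inverse step: it decodes in bitstream order and buffers the reference frames so that pictures can be shown in display order.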

Video Sequence

„ Commences with a sequence header
„ Followed by n GOPs, where n > 0
„ Ends with a sequence_end_code
„ GOP
‹ Each GOP must contain at least one I frame
‹ Assists random access into the sequence
) Therefore, the greater an application's need for random access (RA), the shorter the GOP should be

Role of I frames

IPBBPBBIBB
You want to resume from a given frame …
What if the frame is an
I frame
P frame
B frame
I frames act as synchronisation points
Delay between occurrence of successive I frames should not exceed 400ms
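Two small calculations follow from the role of I frames. The sketch assumes integer frame rates and treats resuming as "go back to the nearest I frame at or before the requested position in the bitstream", which ignores the finer points of B-frame dependencies; both function names are illustrative.

```python
def max_i_frame_spacing(fps, max_delay_ms=400):
    """Largest number of frames between successive I frames within the delay limit."""
    return (fps * max_delay_ms) // 1000

def resume_point(frame_types, target):
    """Index of the nearest I frame at or before target (a synchronisation point)."""
    for i in range(target, -1, -1):
        if frame_types[i] == "I":
            return i
    raise ValueError("no I frame at or before target")

spacing_25hz = max_i_frame_spacing(25)
spacing_30hz = max_i_frame_spacing(30)

stream = "IPBBPBBIBB"
from_p = resume_point(stream, 4)   # a P frame: decoding restarts at the first I frame
from_b = resume_point(stream, 8)   # a B frame after the second I frame
```

At 25 Hz the 400 ms limit allows at most 10 frames between I frames, so a GOP of N=9 fits within it.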

Video Technology: MPEG Frame Types: I Frames

„ Intra-coded images
‹ similar to a JPEG still of the frame
„ Expensive but required
‹ I-frames expensive as they have to compress the entire scene
‹ needed as start frame for differences
‹ needed for scene changes

Video Technology: MPEG Frame Types: P Frames

„ Predictive coded frames
„ based on predicting the movement of blocks from their position in the previous frame (I or P)


Video Technology: MPEG Frame Types: B Frames

„ Bi-directional frames
‹ based on pair of I/P frames, before and after

MPEG 2

„ Motivation …
‹ Provide different qualities of image for different domains (with differing target bit rates)
) E.g., studio quality motion video
‹ MPEG-2 took on the mantle of MPEG-3
) Encoding and compression for HDTV
‹ Standard for digital broadband TV
‹ Interlaced video
‹ DVD quality

Profiles and levels

„ MPEG-2 supports greater choice of bit rate
‹ Up to HDTV picture size and resolution
‹ Allows greater chrominance resolution
) 4:2:2; 4:4:4
‹ Support for wider range of apps
) Family of compression schemes
) Schemes defined by a profile and level
• No single encoder/decoder has to implement all functionality
• Compatibility between newer and older equipment
‹ 5 Profiles
) High, Main, Simple, Spatially scalable, SNR scalable, 4:2:2, multiview etc.

MPEG-4

„ Motivation …
‹ Original objective: develop a low bit rate video compression method
‹ Now a set of tools for interactive multimedia scene composition, multiplexing and synchronisation
) Digital television
) Interactive graphics application
) Interactive multimedia
„ MPEG-4 provides
‹ The standardised technological elements enabling the integration of the production, distribution and content access paradigms of the fields of interactive multimedia, mobile multimedia,…
