L5-6 - Data Representation (Part 2-3)
• A graphic, what we see with multimedia, is really just a grid of pixels
arranged in both the horizontal and vertical directions.
• In the simplest form, each dot, or pixel, is stored as a single bit:
a 0 or a 1.
1 1 0 0 0 0 1 1
1 0 1 1 1 1 0 1
0 1 0 1 1 0 1 0
0 1 1 1 1 1 1 0
0 1 0 1 1 0 1 0
0 1 1 0 0 1 1 0
1 0 1 1 1 1 0 1
1 1 0 0 0 0 1 1
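To see how a grid of bits becomes a picture, the 8 x 8 grid above can be drawn as text. This is only a sketch: the choice of `#` for a 1 bit and `.` for a 0 bit is an assumption, since the slide does not say which value is the dark one, and the names `ROWS`, `render`, and `total_bits` are made up for the illustration.

```python
# Rows of the 8 x 8 one-bit image from the grid above, one string per row.
ROWS = [
    "11000011",
    "10111101",
    "01011010",
    "01111110",
    "01011010",
    "01100110",
    "10111101",
    "11000011",
]


def render(rows, on="#", off="."):
    """Return a text picture: `on` for each 1 bit, `off` for each 0 bit."""
    return "\n".join(
        "".join(on if bit == "1" else off for bit in row) for row in rows
    )


def total_bits(rows):
    """With one bit per pixel, the bit count equals the pixel count."""
    return sum(len(row) for row in rows)


print(render(ROWS))
print(total_bits(ROWS), "bits")  # 8 x 8 pixels at 1 bit each -> 64 bits
```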
What if we use 2 bits
to represent a pixel?
More bits
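The effect of adding bits can be worked out directly: n bits can take 2^n different patterns, so each extra bit doubles the number of shades or colors a pixel can have. A minimal sketch (the function name `distinct_values` is chosen here for illustration):

```python
def distinct_values(bits):
    """Number of different patterns that `bits` binary digits can form."""
    return 2 ** bits


for bits in (1, 2, 4, 8, 24):
    print(f"{bits:>2} bits per pixel -> {distinct_values(bits):,} possible values")
```

With 1 bit a pixel is either off or on; with 2 bits it can take 4 values, for example black, dark gray, light gray, and white.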
https://goo.gl/tGJ3h3
• HOW TO REPRESENT COLORS? → RGB Images
• RGB stands for Red Green Blue
o We generally use 8 bits for each color (red/green/blue)
o i.e., a total of 8 x 3 = 24 bits to represent the color of a pixel.
o Given an amount of red, an amount of green, and an amount of blue,
you can tell a computer how to color a pixel
o None of the colors (all three amounts at 0) yields a black pixel
o All of the colors at full intensity yields a white pixel
o In between these two extremes is where we get all sorts of colors
11111111 00000000 00000000 00000000 00000000 11111111 00000000 11111111 00000000
11111111 11111111 00000000 11111111 00000000 11111111 00000000 11111111 11111111
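The two bit strings above can be decoded by reading them three bytes at a time. The sketch below assumes the bytes are stored in red, green, blue order; the helper `decode_pixels` and the small `NAMES` table are illustrative, not part of any library.

```python
def decode_pixels(bit_string):
    """Split space-separated 8-bit groups into (red, green, blue) tuples."""
    values = [int(group, 2) for group in bit_string.split()]
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]


# Names for a few fully saturated colors (assumed R, G, B byte order).
NAMES = {
    (0, 0, 0): "black",
    (255, 0, 0): "red",
    (0, 255, 0): "green",
    (0, 0, 255): "blue",
    (255, 255, 0): "yellow",
    (255, 0, 255): "magenta",
    (0, 255, 255): "cyan",
    (255, 255, 255): "white",
}

first = ("11111111 00000000 00000000 00000000 00000000 11111111 "
         "00000000 11111111 00000000")
second = ("11111111 11111111 00000000 11111111 00000000 11111111 "
          "00000000 11111111 11111111")

for bits in (first, second):
    print([NAMES.get(pixel, pixel) for pixel in decode_pixels(bits)])
```

Under that byte order, the first string describes a red, a blue, and a green pixel, and the second a yellow, a magenta, and a cyan pixel.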
▪ Two factors that affect the quality of an image:
1. Bit depth: The number of bits available for each pixel in an image.
2. Resolution:
➢ Resolution refers to the number of pixels in an image. It is sometimes
given as the width and height of the image, and sometimes as the total
number of pixels in the image.
➢ For example, an image that is 2048 pixels wide and 1536 pixels high
contains 2048 x 1536 = 3,145,728 pixels (about 3.1 megapixels). You could
call it a 2048 x 1536 image or a 3.1-megapixel image.
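Bit depth and resolution together determine how much raw data an image holds. The sketch below redoes the 2048 x 1536 arithmetic and, assuming 24 bits per pixel and no compression, estimates the storage needed; the function names are illustrative.

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height


def uncompressed_bytes(width, height, bit_depth=24):
    """Raw size in bytes: pixels x bits per pixel, 8 bits to a byte."""
    return pixel_count(width, height) * bit_depth // 8


pixels = pixel_count(2048, 1536)
size = uncompressed_bytes(2048, 1536)
print(f"{pixels:,} pixels = {pixels / 1_000_000:.1f} megapixels")
print(f"{size:,} bytes uncompressed at 24 bits per pixel")
```

At 24 bits per pixel this image needs 9,437,184 bytes before any compression, which is one reason compressed image formats matter.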
If we keep the image size the same and increase the
resolution, the image gets sharper and more detailed.
(The opposite happens if we decrease the resolution.)
Original (400x262)
Half Size (200x131)
Selecting resolution
Q1) Using 6 bits, we can represent _____ different things at most.
a) 6
b) 64
c) 12
d) 32
e) 128
Q2) A _____ is a standard way of storing binary data in a computer.
a) metadata
b) sample rate
c) file format
d) pixel
Q3) A _____ image is one in which the only colors are shades of gray.
a) grayscale
b) binary
c) RGB
d) selfie
Q4) Which factors affect the quality of an image? (Select all that apply)
a) ASCII code
b) Bit depth
c) Sampling rate
d) Resolution
Source: https://www.youtube.com/watch?v=fGASncJR_kg
▪ A video is just a sequence of images (frames) shown quickly in succession
to create the illusion of motion.
▪ Common video file formats: MP4, FLV, AVI, etc.
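Because a video is many images in a row, its raw size grows quickly. The sketch below estimates the uncompressed size of a clip; the 640 x 480 frame size, 30 frames per second, and 60-second duration are hypothetical numbers chosen only for illustration, and 24-bit color is assumed.

```python
def raw_video_bytes(width, height, fps, seconds, bit_depth=24):
    """Uncompressed size: bytes per frame x frames per second x duration."""
    bytes_per_frame = width * height * bit_depth // 8
    return bytes_per_frame * fps * seconds


size = raw_video_bytes(640, 480, fps=30, seconds=60)
print(f"{size:,} bytes for one minute of uncompressed video")
```

For these assumed settings one minute comes to 1,658,880,000 bytes, which is why video files are almost always stored compressed.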
▪ Data compression: Reducing the amount of space needed to store a
piece of data
▪ Why compress files?
• Saving storage space
• Faster data transfer
▪ Compression ratio: The size of the compressed data divided by the size
of the uncompressed data
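The compression ratio definition translates directly into a one-line calculation. In this sketch the file sizes are hypothetical; a ratio closer to 0 means more space was saved, and a ratio of 1 means no saving at all.

```python
def compression_ratio(compressed_size, uncompressed_size):
    """Size of the compressed data divided by size of the uncompressed data."""
    return compressed_size / uncompressed_size


def space_saved_percent(compressed_size, uncompressed_size):
    """Share of the original size that compression removed, as a percentage."""
    return (1 - compression_ratio(compressed_size, uncompressed_size)) * 100


# Hypothetical example: a 1000 KB file compressed to 250 KB.
print(compression_ratio(250, 1000))    # 0.25
print(space_saved_percent(250, 1000))  # 75.0
```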
▪ Bandwidth: The number of bits or bytes that can be transmitted from
one place to another in a fixed amount of time
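Bandwidth links file size to transfer time: time = number of bits / bits per second. The sketch below uses hypothetical values (a 1,000,000-byte file and a 1,000,000 bits-per-second link) to show why a smaller file transfers faster.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_second):
    """Time to send a file, ignoring protocol overhead and latency."""
    return size_bytes * 8 / bandwidth_bits_per_second


original = 1_000_000     # bytes, hypothetical file
compressed = 250_000     # bytes, the same file after compression (assumed)
link = 1_000_000         # bits per second, hypothetical bandwidth

print(transfer_seconds(original, link))    # 8.0
print(transfer_seconds(compressed, link))  # 2.0
```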
▪ Three types of text compression:
1. Keyword encoding
2. Run-length encoding
3. Huffman encoding
Keyword encoding
▪ In this text compression technique, we replace a frequently used word
with a single character
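▪ For example, suppose we used the following chart to encode a few
words (the pairs below are the ones that can be read off the encoded
paragraph that follows):
as → ^
the → ~
and → +
must → &
well → %
these → #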
Keyword encoding
Original paragraph:
The human body is composed of many independent systems, such
as the circulatory system, the respiratory system, and the
reproductive system. Not only must all systems work independently,
but they must interact and cooperate as well. Overall health is a
function of the well-being of separate systems, as well as how these
separate systems work in concert.
Encoded paragraph:
The human body is composed of many independent systems, such ^
~ circulatory system, ~ respiratory system, + ~ reproductive system.
Not only & all systems work independently, but they & interact +
cooperate ^ %. Overall health is a function of ~ %-being of separate
systems, ^ % ^ how # separate systems work in concert.
Original paragraph → 352 characters
Encoded paragraph → 317 characters
Compression ratio → 317/352 or about 0.9
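Keyword encoding can be sketched in a few lines. The word-to-symbol table below is an assumption inferred from the encoded paragraph (as, the, and, must, well, these); the function names are illustrative. Matching is case-sensitive and on whole words, so "The" is left alone while "well-being" becomes "%-being".

```python
import re

# Assumed chart, inferred from the encoded paragraph above.
KEYWORDS = {"as": "^", "the": "~", "and": "+",
            "must": "&", "well": "%", "these": "#"}


def keyword_encode(text, table=KEYWORDS):
    """Replace each whole-word keyword with its one-character symbol."""
    for symbol in table.values():
        if symbol in text:
            # A symbol already present in the text would make decoding ambiguous.
            raise ValueError(f"symbol {symbol!r} already appears in the text")
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    return pattern.sub(lambda match: table[match.group(1)], text)


def keyword_decode(text, table=KEYWORDS):
    """Expand every symbol back into the word it stands for."""
    reverse = {symbol: word for word, symbol in table.items()}
    return "".join(reverse.get(ch, ch) for ch in text)


sentence = ("Not only must all systems work independently, "
            "but they must interact and cooperate as well.")
encoded = keyword_encode(sentence)
print(encoded)
print(len(encoded) / len(sentence))  # compression ratio for this sentence
```

Decoding restores the sentence exactly, so keyword encoding is a lossless technique.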
Keyword encoding
▪ Limitations:
1. The characters we use to replace words cannot appear in the original passage (e.g., $)
2. The words "The" and "the" cannot be encoded by the same character because they
contain different letters
3. We would not gain anything by encoding words like "I" or "a", which are already
a single character
4. The savings per word are small
▪ Advantage:
The encoded patterns are generally complete words rather than suffixes, and a pattern can
also be replaced where it occurs inside a longer word. For example, if dig = ~, then
digging = ~ing. As a result, the encoded pattern generally appears more often than any
single whole word.
Run-length encoding
▪ In this text compression technique, we replace a long series of a repeated
character with a count of the repetition.
• It is also sometimes called recurrence coding.
▪ This type of repetition doesn’t generally take place in English text, but often
occurs in large data streams, such as DNA sequences.
Run-length encoding
▪ Examples
1: AAAAAAA = *A7
2: nnnnnxxxxxxxxxccchhhhhh = *n5*x9ccc*h6
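The two examples follow a simple rule that can be sketched in code: a run is written as a flag character, the repeated character, and a one-digit count. The sketch assumes that runs shorter than 4 are left alone (the three-character code would not save anything, which is why `ccc` stays as is), that runs longer than 9 are split, and that the flag `*` never appears in the input.

```python
def rle_encode(text, flag="*", min_run=4, max_run=9):
    """Replace runs of min_run..max_run equal characters with flag+char+count."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        run = 1
        # Extend the run, but stop at max_run so the count stays one digit.
        while i + run < len(text) and text[i + run] == ch and run < max_run:
            run += 1
        out.append(f"{flag}{ch}{run}" if run >= min_run else ch * run)
        i += run
    return "".join(out)


def rle_decode(encoded, flag="*"):
    """Expand every flag+char+count triple back into the original run."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == flag:
            out.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)


print(rle_encode("AAAAAAA"))  # *A7
sample = "n" * 5 + "x" * 9 + "ccc" + "h" * 6
print(rle_encode(sample))     # *n5*x9ccc*h6
```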
Huffman encoding
▪ In this text compression technique, we use a variable-length binary string
to represent a character so that frequently used characters have short
codes
▪ For example, suppose we use the following Huffman encoding to
represent a few characters:
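To show how variable-length codes work in practice, the sketch below encodes and decodes with a fixed code table. The table is an assumption made for illustration (the chart from the slide is not reproduced here), and the sketch does not build a Huffman tree from character frequencies; it only shows how a prefix-free table is used once it exists.

```python
# Hypothetical prefix-free code table: no code is the start of another code.
CODES = {"A": "00", "E": "01", "L": "100", "O": "110",
         "R": "111", "B": "1010", "D": "1011"}


def huffman_encode(text, codes=CODES):
    """Concatenate the bit string of every character."""
    return "".join(codes[ch] for ch in text)


def huffman_decode(bits, codes=CODES):
    """Read bits left to right, emitting a character whenever a code matches."""
    reverse = {code: ch for ch, code in codes.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    if current:
        raise ValueError("leftover bits do not form a complete code")
    return "".join(out)


encoded = huffman_encode("DOORBELL")
print(encoded, len(encoded), "bits")
print(len(encoded) / (8 * len("DOORBELL")))  # ratio vs. 8 bits per character
```

With this table "DOORBELL" takes 25 bits instead of the 64 bits needed at 8 bits per character. Because no code is a prefix of another, the decoder never needs separators between characters.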
▪ Any kind of data can be compressed. There are two main categories of
compression:
1. lossy
2. lossless
▪ Lossless compression
• Lossless compression doesn’t reduce the quality of the file at all.
• Since no data is lost, lossless compression allows a file to be retrieved exactly
as it was when originally created.
▪ What do the bytes of this image look like? Well, since the top of the flag
is just a solid color, we’re going to have a whole lot of bytes that are
exactly the same. Wouldn’t it be nice if we could just say “here come 100
red pixels” rather than listing out each pixel individually?
Lossless compression
▪ There is a lot of repeated blue in the first image
• Using the same 24 bits to represent each pixel!
▪ The second image shows the compressed representation, which is not what a user would see
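The "here come 100 red pixels" idea can be sketched as run-length encoding applied to pixels instead of characters. The row of pixels below is hypothetical, and the function names are illustrative; the point is that expanding the runs gives back exactly the original row, which is what makes the scheme lossless.

```python
RED = (255, 0, 0)
WHITE = (255, 255, 255)


def compress_row(pixels):
    """Turn a list of pixels into (count, pixel) runs of identical neighbors."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][1] == pixel:
            runs[-1][0] += 1          # same color as before: extend the run
        else:
            runs.append([1, pixel])   # new color: start a new run
    return [(count, pixel) for count, pixel in runs]


def expand_row(runs):
    """Rebuild the original list of pixels from the runs."""
    return [pixel for count, pixel in runs for _ in range(count)]


row = [RED] * 100 + [WHITE] * 20      # hypothetical row of a flag image
runs = compress_row(row)
print(runs)                           # two runs instead of 120 pixels
print(expand_row(runs) == row)        # True: nothing was lost
```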
▪ Lossy compression
• Lossy compression removes some of a file’s original data in order to reduce
the file size.
• This might mean reducing the number of colors in an image or reducing the
number of samples in a sound file.
• This can result in a small loss of quality of an image or sound file.
• The space savings of lossy compression are higher than they are with lossless
compression.
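One simple way to reduce the number of colors is to keep fewer bits per color channel. The sketch below is an illustration of that idea only, not how JPEG or any specific format works; the function name and the choice of 4 bits per channel are assumptions.

```python
def reduce_bit_depth(pixel, bits=4):
    """Keep only the top `bits` bits of each 8-bit channel of an RGB pixel."""
    shift = 8 - bits
    return tuple((channel >> shift) << shift for channel in pixel)


original = (200, 123, 37)
reduced = reduce_bit_depth(original)
print(original, "->", reduced)

# Different original colors can collapse to the same reduced color,
# so the original value cannot be recovered afterwards.
print(reduce_bit_depth((201, 120, 40)) == reduced)
```

Each channel now has only 16 possible values, so a pixel needs 12 bits instead of 24, but the discarded low bits are gone for good. That trade of quality for size is what makes the compression lossy.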
Image File Formats:
▪ JPEG (Joint Photographic Experts Group)
• Supports 24-bit color
• Uses lossy compression
▪ PNG (Portable Network Graphics)
• High quality graphics
• Supports 24-bit color
• Uses lossless compression
▪ BMP (Bitmap)
• Originally used by Windows
• Not super common these days
Image File Formats:
• GIF (Graphics Interchange Format)
o Low quality images
o Only supports up to 8-bit color
o Often used for memes
o Can be animated
o Like a video file with only a few images
Information layer → completed ^_^
1. Computer Science Illuminated – Nell Dale, John Lewis
• Chapter 3
2. https://www.bbc.co.uk/bitesize/subjects/zvc9q6f
3. https://www.bbc.co.uk/bitesize/guides/zjfgjxs/revision/1
4. https://cs50.harvard.edu/technology/2017/