Image Processing (4th Class)
Source: www.uotechnology.edu.iq
Chapter One
Introduction to Computer Vision and Image Processing
1.1 Computer Imaging
Computer imaging can be defined as the acquisition and processing of visual information by computer. The computer representation of an image requires the equivalent of many thousands of words of data, and this massive amount of data is a primary reason for the development of many sub-areas within the field of computer imaging, such as image compression and segmentation. Another important aspect of computer imaging involves the ultimate receiver of the visual information: in some cases the human visual system and in others the computer itself.
Computer imaging can be separated into two primary categories:
1. Computer Vision.
2. Image Processing.
(In computer vision applications the processed images are output for use by a computer, whereas in image processing applications the output images are for human consumption.)
These two categories are not totally separate and distinct. The boundaries that separate the two are fuzzy, but this definition allows us to explore the differences between the two and to understand how they fit together (Figure 1.1).


Computer imaging can be separated into two different but overlapping areas.
Figure (1.1) Computer Imaging [1].

Historically, the field of image processing grew out of electrical engineering as an extension of the signal processing branch, whereas the computer science discipline was largely responsible for developments in computer vision.

1.2 Computer Vision
Computer vision is computer imaging where the application does not involve a human being in the visual loop. One of the major topics within the field of computer vision is image analysis.
1. Image Analysis: involves the examination of the image data to facilitate solving a vision problem.
The image analysis process involves two other topics:
Feature Extraction: the process of acquiring higher-level image information, such as shape or color information.
Pattern Classification: the act of taking this higher-level information and identifying objects within the image.

Computer vision systems are used in many and various types of environments, such as:
1. Manufacturing systems.
2. The medical community.
3. Law enforcement.
4. Infrared imaging.
5. Orbiting satellites.

1.3 Image Processing
Image processing is computer imaging where the application involves a human being in the visual loop. In other words, the images are to be examined and acted upon by people.
The major topics within the field of image processing include:
1. Image restoration.
2. Image enhancement.
3. Image compression.


1.3.1 Image Restoration
Image restoration is the process of taking an image with some known, or estimated, degradation and restoring it to its original appearance. Image restoration is often used in the field of photography or publishing, where an image was somehow degraded but needs to be improved before it can be printed (Figure 1.2).

a. Image with distortion b. Restored image
Figure (1.2) Image Restoration
1.3.2 Image Enhancement
Image enhancement involves taking an image and improving it visually, typically by taking advantage of the human visual system's response. One of the simplest enhancement techniques is to simply stretch the contrast of an image.
Enhancement methods tend to be problem specific. For example, a method that is used to enhance satellite images may not be suitable for enhancing medical images.
Although enhancement and restoration are similar in aim (to make an image look better), they differ in how they approach the problem. Restoration methods attempt to model the distortion to the image and reverse the degradation, whereas enhancement methods use knowledge of the human visual system's response to improve an image visually.


a. Image with poor contrast b. Image enhanced by contrast stretching
Figure (1.3) Image Enhancement


1.3.3 Image Compression
Image compression involves reducing the typically massive amount of data needed to represent an image. This is done by eliminating data that are visually unnecessary and by taking advantage of the redundancy that is inherent in most images. Image processing systems are used in many and various types of environments, such as:
1. The medical community.
2. Computer-aided design.
3. Virtual reality.
4. Image processing.


a. Image before compression (92 KB) b. Image after compression (6.59 KB)
Figure (1.4) Image Compression.

1.4 Computer Imaging Systems
Computer imaging systems are comprised of two primary component types: hardware and software. The hardware components can be divided into the image acquisition subsystem (computer, scanner, and camera) and the display devices (monitor, printer). The software allows us to manipulate the image and perform any desired processing on the image data.

1.5 Digitization
Digitization is the process of transforming a standard video signal into a digital image. This transformation is necessary because the standard video signal is in analog (continuous) form and the computer requires a digitized or sampled version of that continuous signal. The analog video signal is turned into a digital image by sampling the continuous signal at a fixed rate. In the figure below we see one line of a video signal being sampled (digitized) by instantaneously measuring the voltage of the signal at fixed intervals in time.
The value of the voltage at each instant is converted into a number that is stored, corresponding to the brightness of the image at that point.
Note that the brightness of the image at that point depends on both the intrinsic properties of the object and the lighting conditions in the scene.







Figure (1.5) Digitizing (Sampling) an Analog Video Signal [1].

The image can now be accessed as a two-dimensional array of data, where each data point is referred to as a pixel (picture element). For digital images we will use the following notation:
I(r,c) = the brightness of the image at the point (r,c),
where r = row and c = column.
When we have the data in digital form, we can use the software to process
the data.
The digital image is a 2-D array:

I(0,0)     I(0,1)     ...  I(0,N-1)
I(1,0)     I(1,1)     ...  I(1,N-1)
...
I(N-1,0)   I(N-1,1)   ...  I(N-1,N-1)

In the above image matrix, the image size is (N x N) [the matrix dimension]; then:

Ng = 2^m        ...(1)

where Ng denotes the number of gray levels and m is the number of bits per pixel in the digital image matrix.

Example: If we have 6 bits/pixel in a 128 x 128 image, find the number of gray levels used to represent it, then find the number of bits in this image.
Solution:
Ng = 2^6 = 64 gray levels
Nb = 128 * 128 * 6 = 98304 = 9.8304 * 10^4 bits
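This calculation is easy to script; the following is a minimal sketch in Python (the variable names are illustrative, not taken from the text):

# Number of gray levels and total bit count for an N x N image with m bits/pixel.
m = 6                        # bits per pixel
N = 128                      # image dimensions: N x N

num_gray_levels = 2 ** m     # Ng = 2^m
total_bits = N * N * m       # one m-bit value per pixel

print(num_gray_levels)       # 64
print(total_bits)            # 98304, i.e. 9.8304 x 10^4 bits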
1.6 The Human Visual System
The Human Visual System (HVS) has two primary components:
Eye.
Brain.
* The structure that we know the most about is the image-receiving sensor (the human eye).
* The brain can be thought of as an information processing unit, analogous to the computer in our computer imaging system.
These two are connected by the optic nerve, which is really a bundle of nerves that contains the pathways along which visual information travels from the receiving sensor (the eye) to the processor (the brain).

1.7 Image Resolution
Resolution has to do with the ability to separate two adjacent pixels as being separate; if we can, we say that we can resolve the two. The concept of resolution is closely tied to the concept of spatial frequency.
In the spatial frequency concept, frequency refers to how rapidly the signal changes in space, and here the signal has two values for brightness: 0 and maximum. If we use this signal for one line (row) of an image and then repeat the line down the entire image, we get an image of vertical stripes. If we increase this frequency, the stripes get closer and closer together, until they finally blend together.





a. Freq. = 2 b. Freq. = 3 c. Freq. = 5
Figure (1.6) Resolution and Spatial Frequency [1].



1.8 Image Brightness Adaptation
In images we observe many brightness levels, and the vision system can adapt to a wide range. If the mean value of the pixels inside the image is around the zero gray level, then the brightness is low and the image is dark, but if the mean value is near 255, then the image is light. If too few gray levels are used, we observe false contours: bogus lines resulting from gradually changing light intensity not being accurately represented.

1.9 Image Representation
We have seen that the human visual system (HVS) receives an input image as a collection of spatially distributed light energy; this form is called an optical image. Optical images are the type we deal with every day: cameras capture them, monitors display them, and we see them. We know that these optical images are represented as video information in the form of analog electrical signals, and we have seen how these are sampled to generate the digital image I(r,c).
The digital image I(r,c) is represented as a two-dimensional array of data, where each pixel value corresponds to the brightness of the image at the point (r,c). In linear algebra terms, a two-dimensional array like our image model I(r,c) is referred to as a matrix, and one row (or column) is called a vector.
The image types we will consider are:

1. Binary Image
Binary images are the simplest type of images and can take on two values, typically black and white, or 0 and 1. A binary image is referred to as a 1 bit/pixel image because it takes only 1 binary digit to represent each pixel.
These types of images are used most frequently in computer vision applications where the only information required for the task is general shape, or outline, information; for example, to position a robotic gripper to grasp an object, or in optical character recognition (OCR).
Binary images are often created from gray-scale images via a threshold operation: every pixel whose value is above the threshold is turned white (1), and those below it are turned black (0). (A small sketch of this operation is given after Figure (1.7).)



Figure (1.7) Binary Images.
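As a small illustration of the thresholding operation mentioned above, the following sketch assumes the gray-scale image is already loaded as an 8-bit NumPy array (the tiny array and the threshold value are invented for the example):

import numpy as np

# Hypothetical 8-bit gray-scale image; in practice this would be read from a file.
gray = np.array([[ 12, 200,  90],
                 [180,  60, 250],
                 [ 30, 140,  10]], dtype=np.uint8)

threshold = 128
# Pixels above the threshold become white (1); those at or below it become black (0).
binary = (gray > threshold).astype(np.uint8)

print(binary)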
2. Gray-Scale Image
Gray-scale images are referred to as monochrome, or one-color, images. They contain brightness information only, no color information. The number of bits used for each pixel determines the number of different brightness levels available. The typical image contains 8 bits/pixel of data, which allows us to have 256 (0-255) different brightness (gray) levels. The 8-bit representation is typical because the byte, which corresponds to 8 bits of data, is the standard small unit in the world of digital computers.




Figure (1.8) Gray Scale Images.
3. Color Image
Color images can be modeled as three-band monochrome image data, where each band of the data corresponds to a different color.

Figure (1.8) Color Images.

The actual information stored in the digital image data is brightness
information in each spectral band. When the image is displayed, the
corresponding brightness information is displayed on the screen by
picture elements that emit light energy corresponding to that particular
color.
Typical color images are represented as red, green, and blue, or RGB, images. Using the 8-bit monochrome standard as a model, the corresponding color image has 24 bits/pixel: 8 bits for each of the three color bands (red, green, and blue). In the following figure we see a representation of a typical RGB color image.
I_R(r,c)   I_G(r,c)   I_B(r,c)

Figure (1.9) A typical RGB color image can be thought of as three separate images I_R(r,c), I_G(r,c), I_B(r,c) [1].
The following figure illustrates that, in addition to referring to a row or column as a vector, we can refer to a single pixel's red, green, and blue values as a color pixel vector (R, G, B).







Figure (1.10) A color pixel vector consists of the red, green and blue
pixel values (R, G, B) at one given row/column pixel
coordinate( r , c) [1].

For many applications, RGB color information is transformed into a mathematical space that decouples the brightness information from the color information.
The hue/saturation/lightness (HSL) color transform allows us to describe colors in terms that we can more readily understand.

Figure (1.11) HSL Color Space (lightness runs from black to white; saturation from zero to full) [1].
The lightness is the brightness of the color, and the hue is what we normally think of as "color" (for example green, blue, red, or orange). The saturation is a measure of how much white is in the color (for example, pink is red with more white, so it is less saturated than a pure red). [Most people relate to this method for describing color.]
Example: a deep, bright orange would have a large intensity (bright), a hue of orange, and a high value of saturation (deep). We can picture this color in our minds, but if we had to define this color in terms of its RGB components, R=245, G=110 and B=20, the description would be far less intuitive.
Modeling the color information this way creates a more people-oriented way of describing colors.
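Python's standard colorsys module includes an RGB-to-hue/lightness/saturation conversion that can be used to check the orange example above (colorsys expects values scaled to the 0-1 range; the printed numbers are simply what the library returns, not values from the text):

import colorsys

r, g, b = 245, 110, 20                              # the "deep, bright orange" example
h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)

print(f"hue        = {h * 360:.1f} degrees")        # roughly 24 degrees, i.e. orange
print(f"lightness  = {l:.2f}")
print(f"saturation = {s:.2f}")                      # close to 1.0, i.e. highly saturated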
4. Multispectral Images
Multispectral images typically contain information outside the normal human perceptual range. This may include infrared, ultraviolet, X-ray, acoustic, or radar data. Sources of these types of images include satellite systems, underwater sonar systems, and medical diagnostic imaging systems.

1.10 Digital Image File Formats
Why do we need so many different types of image file formats? The short answer is that there are many different types of images and applications with varying requirements. A more complete answer also considers market share, proprietary information, and a lack of coordination within the imaging industry.
Many image types can be converted from one type to another with easily available image conversion software. A field related to computer imaging is computer graphics.

1.10.1 Computer Graphics
Computer graphics is a specialized field within the computer science realm that refers to the reproduction of visual data through the use of the computer.
In computer graphics, types of image data are divided into two primary categories:
1. Bitmap images (or raster images): can be represented by our image model I(r,c), where we have pixel data and the corresponding brightness values stored in some file format.
2. Vector images: refer to methods of representing lines, curves, and shapes by storing only the key points. These key points are sufficient to define the shapes, and the process of turning these into an image is called rendering. After the image has been rendered, it can be thought of as being in bitmap format, where each pixel has specific values associated with it.
Most of the file format types fall into the category of bitmap images. In general, these types of images contain both header information and the raw pixel data. The header information contains:
1. The number of rows (height).
2. The number of columns (width).
3. The number of bands.
4. The number of bits per pixel.
5. The file type.
6. Additionally, with some of the more complex file formats, the header may contain information about the type of compression used and other parameters necessary to create the image I(r,c).

1.10.2 Image File Formats
1. BMP format:
This is the format used by Windows; it is typically an uncompressed format. The image data are located in the data field, while there are two other fields: one for the header (54 bytes), which contains the image information such as height, width, number of bits per pixel, number of bands, and the file type; the second field is the color map, or color palette, for gray-level images, whose entries are indexed 0-255. (A sketch of reading this header is given after this list of formats.)
2. Bin file format:
This is the raw image data I(r,c) with no header information.
3. PPM file format:
This contains raw image data with the simplest header. The PPM family includes PBM (binary), PGM (gray), and PPM (color); the header contains a magic number that identifies the file type.
4. TIFF (Tagged Image File Format) and GIF (Graphics Interchange Format):
These are used on the World Wide Web (WWW). GIF files are limited to a maximum of 8 bits/pixel and allow for a type of compression called LZW. The GIF image header is 13 bytes long and contains the basic information.
5. JPEG (Joint Photographic Experts Group):
This is simply a standard that allows image compression algorithms to be used on many different computer platforms. JPEG image compression is used extensively on the WWW. It is flexible, so it can create large files with excellent image quality.

6. VIP (Visualization in Image Processing) format:
This was developed for the CVIPtools software. When performing operations, temporary images are created that use floating-point representations, which are beyond the standard 8 bits/pixel. To represent this type of data, remapping is used, which is the process of taking the original image and applying an equation to translate it into the range (0-255).
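As an illustration of the header fields listed for the BMP format above, here is a minimal sketch that reads the width, height, and bits-per-pixel entries from the standard 54-byte BMP header (the byte offsets follow the common Windows BMP layout, and "image.bmp" is only a placeholder file name):

import struct

# Read the 54-byte BMP header (14-byte file header + 40-byte info header).
with open("image.bmp", "rb") as f:                 # placeholder file name
    header = f.read(54)

signature = header[0:2]                            # should be b'BM'
file_size, = struct.unpack_from("<I", header, 2)   # whole file size in bytes
width,     = struct.unpack_from("<i", header, 18)  # image width (columns)
height,    = struct.unpack_from("<i", header, 22)  # image height (rows)
bits_per_pixel, = struct.unpack_from("<H", header, 28)

print(signature, file_size, width, height, bits_per_pixel)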

Questions:
Q1/ What are the applications of computer vision?
Q2/ What are the applications of image processing? Describe them.
Q3/ What are the differences between a raster image and a vector image?
Q4/ Find the number of gray levels and the number of bits for a (512 x 512) image, noting that the image contains 8 bits/pixel.






Chapter Two
Image Analysis
2.1 Image Analysis
Image analysis involves manipulating the image data to determine exactly the information necessary to help solve a computer imaging problem. This analysis is typically part of a larger process, is iterative in nature, and allows us to answer application-specific questions: Do we need color information? Do we need to transform the image data into the frequency domain? Do we need to segment the image to find object information? What are the important features of the image?
Image analysis is primarily a data reduction process. As we have seen, images contain an enormous amount of data, typically on the order of hundreds of kilobytes or even megabytes. Often much of this information is not necessary to solve a specific computer imaging problem, so a primary part of the image analysis task is to determine exactly what information is necessary. Image analysis is used in both computer vision and image processing.
For computer vision, the end product is typically the extraction of high-level information for computer analysis or manipulation. This high-level information may include shape parameters to control a robotic manipulator, or color and texture features to help in the diagnosis of a skin tumor.
In image processing applications, image analysis methods may be used to help determine the type of processing required and the specific parameters needed for that processing; for example, determining the degradation function for an image restoration procedure, developing an enhancement algorithm, or determining exactly what information is visually important for an image compression method.

2.2 System Model
The image analysis process can be broken down into three primary stages:
1. Preprocessing.
2. Data Reduction.
3. Feature Analysis.

1. Preprocessing:
Preprocessing is used to remove noise and eliminate irrelevant, visually unnecessary information. Noise is unwanted information that can result from the image acquisition process. Other preprocessing steps might include:
Gray-level or spatial quantization (reducing the number of bits per pixel or the image size).
Finding regions of interest for further processing.

2. Data Reduction:
This involves either reducing the data in the spatial domain or transforming it into another domain, called the frequency domain, and then extracting features for the analysis process.

3. Feature Analysis:
The features extracted by the data reduction process are examined and evaluated for their use in the application.
After preprocessing we can perform segmentation on the image in the spatial domain or convert it into the frequency domain via a mathematical transform. After these processes we may choose to filter the image. This filtering process further reduces the data and allows us to extract the features that we may require for analysis.

2.3 Preprocessing
The preprocessing algorithms, techniques, and operators are used to perform initial processing that makes the primary data reduction and analysis task easier. They include operations related to:
Extracting regions of interest.
Performing basic algebraic operations on images.
Enhancing specific image features.
Reducing data in resolution and brightness.
Preprocessing is a stage where the requirements are typically obvious and simple, such as the removal of artifacts from images or the elimination of image information that is not required for the application. For example, in one application we needed to eliminate borders from images that had been digitized from film. Another example of a preprocessing step involves a robotic gripper that needs to pick and place an object; for this we reduce a gray-level image to a binary (two-valued) image that contains all the information necessary to discern the object's outline.

2.3.1 Region-of-Interest Image Geometry
Often, for image analysis, we want to investigate a specific area within the image more closely, called a region of interest (ROI). To do this we need operations that modify the spatial coordinates of the image, and these are categorized as image geometry operations. The image geometry operations discussed here include:
Crop, zoom, enlarge, shrink, translate, and rotate.
The image crop process is the process of selecting a small portion of the image, a subimage, and cutting it away from the rest of the image. After we have cropped a subimage from the original image, we can zoom in on it by enlarging it. The zoom process can be done in numerous ways:
1. Zero-Order Hold.
2. First-Order Hold.
3. Convolution.

1. Zero-Order hold: is performed by repeating previous pixel values,
thus creating a blocky effect as in the following figure:



Figure (2.1): Zero-Order Hold Method
2. First-Order Hold: this is performed by finding a linear interpolation between adjacent pixels, i.e., finding the average value between two pixels and using that as the pixel value between those two. We can do this for the rows first, as follows:

Original Image Array          Image with Rows Expanded
8 4 8                         8 6 4 6 8
4 8 4                         4 6 8 6 4
8 2 8                         8 5 2 5 8

The first two pixels in the first row are averaged, (8+4)/2 = 6, and this number is inserted between those two pixels. This is done for every pixel pair in each row.
Next, take that result and expand the columns in the same way, as follows:

Image with rows and columns expanded
8   6     4   6     8
6   6     6   6     6
4   6     8   6     4
6   5.5   5   5.5   6
8   5     2   5     8

This method allows us to enlarge an N x N sized image to a size of (2N-1) x (2N-1), and it can be repeated as desired.
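The row-and-column averaging just described can be written directly; the following sketch (which assumes NumPy is available) reproduces the 3x3 to 5x5 expansion shown above:

import numpy as np

def first_order_hold(img):
    """Enlarge an N x N image to (2N-1) x (2N-1) by linear interpolation."""
    img = img.astype(float)
    # Expand the rows: insert the average of each horizontally adjacent pair.
    rows = []
    for r in img:
        new_r = [r[0]]
        for a, b in zip(r[:-1], r[1:]):
            new_r += [(a + b) / 2, b]
        rows.append(new_r)
    tmp = np.array(rows)
    # Expand the columns the same way, by working on the transposed array.
    cols = []
    for c in tmp.T:
        new_c = [c[0]]
        for a, b in zip(c[:-1], c[1:]):
            new_c += [(a + b) / 2, b]
        cols.append(new_c)
    return np.array(cols).T

original = np.array([[8, 4, 8],
                     [4, 8, 4],
                     [8, 2, 8]])
print(first_order_hold(original))   # matches the 5x5 array shown above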

3. Convolution: this process uses a mathematical operation to enlarge an image. It requires two steps:
1. Extend the image by adding rows and columns of zeros between the existing rows and columns.
2. Perform the convolution.
The image is extended as follows:

Original Image Array          Image extended with zeros
3 5 7                         0 0 0 0 0 0 0
2 7 6                         0 3 0 5 0 7 0
3 4 9                         0 0 0 0 0 0 0
                              0 2 0 7 0 6 0
                              0 0 0 0 0 0 0
                              0 3 0 4 0 9 0
                              0 0 0 0 0 0 0

Next, we use a convolution mask, which is slid across the extended image, and perform a simple arithmetic operation at each pixel location.

Convolution mask for first-order hold:
1/4  1/2  1/4
1/2   1   1/2
1/4  1/2  1/4
The convolution process requires us to overlay the mask on the image, multiply the coincident values, and sum all of these results. This is equivalent to finding the vector inner product of the mask with the underlying subimage. The vector inner product is found by overlaying the mask on the subimage, multiplying coincident terms, and summing the resulting products.
For example, if we put the mask over the upper-left corner of the image, we obtain (from left to right, and top to bottom):

1/4(0) + 1/2(0) + 1/4(0) + 1/2(0) + 1(3) + 1/2(0) + 1/4(0) + 1/2(0) + 1/4(0) = 3

Note that the existing image values do not change. The next step is to slide the mask over by one pixel and repeat the process, as follows:

1/4(0) + 1/2(0) + 1/4(0) + 1/2(3) + 1(0) + 1/2(5) + 1/4(0) + 1/2(0) + 1/4(0) = 4

Note that this is the average of the two existing neighbors. This process continues until we get to the end of the row, each time placing the result of the operation in the location corresponding to the center of the mask.
When the end of the row is reached, the mask is moved down one row, and the process is repeated row by row until it has been performed on the entire image; this process of sliding, multiplying, and summing is called convolution.




Figure (2.2): First-Order Hold Method


Note that the output image must be put in a separate image array called a
buffer, so that the existing values are not overwritten during the convolution
process.

The convolution process:
a. Overlay the convolution mask in the upper-left corner of the image. Multiply coincident terms, sum, and put the result into the image buffer at the location that corresponds to the mask's current center, which is (r,c) = (1,1).
b. Move the mask one pixel to the right, multiply coincident terms, sum, and place the new result into the buffer at the location that corresponds to the new center location of the convolution mask, which is now at (r,c) = (1,2); continue to the end of the row.
c. Move the mask down one row and repeat the process until the mask is convolved with the entire image. Note that we lose the outer row(s) and column(s).
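A small sketch of the zero-padded convolution just described, written in plain Python/NumPy rather than with any particular imaging library; it extends the 3x3 example image with zeros and convolves it with the first-order hold mask:

import numpy as np

def convolve(image, mask):
    """Slide the mask over the image; results go into a separate buffer so the
    existing values are never overwritten. The outer rows/columns are lost."""
    m, n = mask.shape
    rows, cols = image.shape
    out = np.zeros_like(image, dtype=float)
    for r in range(rows - m + 1):
        for c in range(cols - n + 1):
            region = image[r:r + m, c:c + n]
            # vector inner product of the mask with the underlying sub-image
            out[r + m // 2, c + n // 2] = np.sum(region * mask)
    return out

original = np.array([[3, 5, 7],
                     [2, 7, 6],
                     [3, 4, 9]], dtype=float)

# Extend the image by inserting rows and columns of zeros, as in the text.
extended = np.zeros((7, 7))
extended[1:6:2, 1:6:2] = original

# First-order hold convolution mask.
mask = np.array([[0.25, 0.5, 0.25],
                 [0.5,  1.0, 0.5 ],
                 [0.25, 0.5, 0.25]])

print(convolve(extended, mask))   # the existing pixel values are unchanged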
Why do we use this convolution method when it requires so many more calculations than the basic averaging-of-the-neighbors method?
The answer is that many computer boards can perform convolution in hardware, which is generally very fast, typically much faster than applying a fast algorithm in software. Not only can first-order hold be performed via convolution, but zero-order hold can also be achieved by extending the image with zeros and using the following convolution mask:

Zero-order hold convolution mask
1 1
1 1

Note that for this mask we will need to put the result in the pixel location corresponding to the lower-right corner, because there is no center pixel.
These methods will only allow us to enlarge an image by a factor of (2N-1), but what if we want to enlarge an image by something other than a factor of (2N-1)?
To do this we need to apply a more general method: we take two adjacent values and linearly interpolate more than one value between them. This is done by defining an enlargement number k and then following this process:
1. Find the difference between the two adjacent values.
2. Divide the difference by k.
3. Add the result to the smaller value, and keep adding the result from the second step in a running total until all (k-1) intermediate pixel locations are filled.

Example: we want to enlarge an image to three times its original size, and we have two adjacent pixel values 125 and 140.
1. Find the difference between the two values: 140 - 125 = 15.
2. The desired enlargement is k = 3, so we get 15/3 = 5.
3. Next, determine how many intermediate pixel values we need: k - 1 = 3 - 1 = 2. The two pixel values between 125 and 140 are 125 + 5 = 130 and 125 + 2*5 = 135.
We do this for every pair of adjacent pixels, first along the rows and then along the columns. This will allow us to enlarge the image to a size of K(N-1)+1, where K is an integer and N x N is the original image size.
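A brief sketch of this k-times interpolation for a single pair of adjacent pixel values (the function name and the sample values are illustrative):

def interpolate_pair(low, high, k):
    """Return the (k-1) intermediate values inserted between two adjacent pixels."""
    step = (high - low) / k                     # difference between the values, divided by k
    # keep adding the step to the smaller value
    return [low + step * i for i in range(1, k)]

print(interpolate_pair(125, 140, 3))            # [130.0, 135.0], as in the example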
The process opposite to enlarging an image is shrinking it. This is done by reducing the amount of data that needs to be processed.
Two other operations of interest in image geometry are translation and rotation. These processes may be performed for many application-specific reasons, for example to align an image with a known template in a pattern matching process, or to make certain image details easier to see.










2.4 Image Algebra
There are two primary categories of algebraic operations applied to images:
1. Arithmetic operations.
2. Logic operations.

Addition, subtraction, division, and multiplication comprise the arithmetic operations, while AND, OR, and NOT make up the logic operations. These operations are done on a pixel-by-pixel basis and, except for NOT (which requires only one image), they operate on pairs of images.
To apply the arithmetic operations to two images, we simply operate on corresponding pixel values. For example, to add images I1 and I2 to create I3:

I1:         I2:         I1 + I2:               I3:
3 4 7       6 6 6       3+6  4+6  7+6          9  10  13
3 4 5   +   4 2 6   =   3+4  4+2  5+6    =     7   6  11
2 4 6       3 5 5       2+3  4+5  6+5          5   9  11
Addition is used to combine the information in two images. Applications include the development of image restoration algorithms for modeling additive noise, and special effects such as image morphing in motion pictures.
Subtraction of two images is often used to detect motion. Consider the case where nothing has changed in a scene; the image resulting from the subtraction of two sequential images is filled with zeros (a black image). If something has moved in the scene, subtraction produces a nonzero result at the location of movement. Applications include object tracking, medical imaging, law enforcement, and military applications.
Multiplication and division are used to adjust the brightness of an image. One of the operands typically consists of a constant value greater than one. Multiplication of the pixel values by a number greater than one will brighten the image, and division will darken it (brightness adjustment is often used as a preprocessing step in image enhancement).

The logic operations AND, OR and NOT form a complete set, meaning that
any other logic operation (XOR, NOR, NAND) can be created by a
combination of these basic elements. They operate in a bit-wise fashion on
pixel data.




a. First Original image b. Second Original c. Addition of two images
Figure (2.3): Image Addition.

Figure (2.4): Image Subtraction. (a. Original scene; b. Same scene later; c. Subtraction of scene a from scene b.)
Figure (2.5): Image Multiplication. (a. Cameraman image; b. X-ray image of hand; c. Multiplication of the two images.)
Figure (2.6): Image Division. (a. Original image; b. Image divided by a value < 1; c. Image divided by a value > 1.)

Example: A logic AND is performed on two images; suppose the two corresponding pixel values are (111)10 in one image and (88)10 in the second image. The corresponding bit strings are:

  (111)10 = 01101111
  AND
  (88)10  = 01011000
  -------------------
            01001000 = (72)10

The logic operations AND and OR are used to combine the information in two images. They may be done for special effects, but a more useful application for image analysis is to perform a masking operation. AND and OR can be used as a simple method to extract a region of interest from an image, if more sophisticated graphical methods are not available.
Example: A white square ANDed with an image will allow only the portion of the image coincident with the square to appear in the output image, with the background turned black; and a black square ORed with an image will allow only the part of the image corresponding to the black square to appear in the output image, but will turn the rest of the image white. This process is called image masking. The NOT operation creates a negative of the original image by inverting each bit within each pixel value.
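A minimal sketch of image masking with bit-wise operations, assuming 8-bit NumPy arrays (the tiny image and the square mask are invented for illustration):

import numpy as np

# Hypothetical 4x4 gray-scale image.
image = np.array([[ 10,  20,  30,  40],
                  [ 50, 200, 210,  60],
                  [ 70, 220, 230,  80],
                  [ 90, 100, 110, 120]], dtype=np.uint8)

# White square (255) over the region of interest, black (0) elsewhere.
roi_mask = np.zeros_like(image)
roi_mask[1:3, 1:3] = 255

masked = np.bitwise_and(image, roi_mask)   # keeps only the ROI; background turns black
negative = np.bitwise_not(image)           # NOT operation: complement of the image

print(masked)
print(negative)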

Figure (2.7): Image Masking. (a. Original image; b. Image mask (AND); c. ANDing a and b; d. Image mask (OR); e. ORing a and d.)
Figure (2.8): Complement Image. (a. Original image; b. Image after NOT operation.)
2.5 Image Restoration
Image restoration methods are used to improve the appearance of an image by applying a restoration process that uses a mathematical model of the image degradation.
Examples of the types of degradation include:
1. Blurring caused by motion or atmospheric disturbance.
2. Geometric distortion caused by imperfect lenses.
3. Superimposed interference patterns caused by mechanical systems.
4. Noise from electronic sources.

2.5.1 What is noise?
Noise is any undesired information that contaminates an image. Noise appears in images from a variety of sources. The digital image acquisition process, which converts an optical image into a continuous electrical signal that is then sampled, is the primary process by which noise appears in digital images.
At every step in the process there are fluctuations caused by natural phenomena that add a random value to the exact brightness value for a given pixel. In a typical image the noise can be modeled with one of the following distributions:
1. Gaussian (normal) distribution.
2. Uniform distribution.
3. Salt-and-pepper distribution.


Figure (2.9): Image Noise.
2.5.2 Noise Removal Using Spatial Filters
Spatial filtering is typically done to:
1. Remove various types of noise in digital images.
2. Perform some type of image enhancement.
[These filters are called spatial filters to distinguish them from frequency domain filters.]
The three types of filters are:
1. Mean filters
2. Median filters (order filter)
3. Enhancement filters
Mean and median filters are used primarily to conceal or remove noise, although they may also be used for special applications. For instance, a mean filter adds a softer look to an image. The enhancement filters highlight edges and details within the image.
Spatial filters are implemented with convolution masks. Because a convolution mask operation provides a result that is a weighted sum of the values of a pixel and its neighbours, it is called a linear filter.
The overall effect of a convolution mask can be predicted based on its general pattern. For example:
If the coefficients of the mask sum to one, the average brightness of the image will be retained.
If the coefficients of the mask sum to zero, the average brightness will be lost and a dark image will be returned.
If the coefficients of the mask alternate between positive and negative, the mask is a filter that returns edge information only.
If the coefficients of the mask are all positive, it is a filter that will blur the image.
The mean filters are essentially averaging filters. They operate on local groups of pixels called neighbourhoods and replace the centre pixel with an average of the pixels in this neighbourhood. This replacement is done with a convolution mask such as the following 3x3 arithmetic mean filter (smoothing or low-pass filter) mask:

1/9  1/9  1/9
1/9  1/9  1/9
1/9  1/9  1/9
Note that the coefficients of this mask sum to one, so the image brightness will be retained, and the coefficients are all positive, so it will tend to blur the image. This type of mean filter smooths out local variations within an image, so it is essentially a low-pass filter. A low-pass filter can therefore be used to attenuate image noise that is composed primarily of high-frequency components.
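A small sketch of the 3x3 arithmetic mean filter using the mask above (SciPy's convolve2d is assumed to be available; a hand-written convolution loop like the one earlier in this chapter would work just as well):

import numpy as np
from scipy.signal import convolve2d

# 3x3 arithmetic mean (low-pass) mask: all coefficients 1/9, summing to one.
mean_mask = np.ones((3, 3)) / 9.0

# Hypothetical gray-scale image containing an isolated noise spike (the 255).
noisy = np.array([[52, 55,  61, 59],
                  [62, 59, 255, 65],
                  [63, 65,  66, 68],
                  [64, 70,  72, 69]], dtype=float)

smoothed = convolve2d(noisy, mean_mask, mode='same', boundary='symm')
print(np.round(smoothed).astype(np.uint8))   # local variations are smoothed out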

Figure (2.10): Mean Filter. (a. Original image; b. Mean filtered image.)
The median filter is a nonlinear filter (order filter). These filters are based on a specific type of image statistics called order statistics. Typically, these filters operate on a small subimage, or window, and replace the centre pixel value (similar to the convolution process).
Order statistics is a technique that arranges all of the pixels in sequential order; given an N x N window W, the pixel values can be ordered from smallest to largest:
I1 <= I2 <= I3 <= ... <= IN

where I1, I2, I3, ..., IN are the intensity values of the subset of pixels in the image.

Figure (2.11): Median Filter. (a. Salt-and-pepper noise; b. Median filtered image (3x3).)
Example: Given the following 3x3 neighborhood:

5 5 6
3 4 5
3 4 7

We first sort the values in order of size: (3,3,4,4,5,5,5,6,7); then we select the middle value, in this case 5. This 5 is then placed in the centre location.
A median filter can use a neighbourhood of any size, but 3x3, 5x5, and 7x7 are typical. Note that the output image must be written to a separate image (a buffer) so that the results are not corrupted as this process is performed.
The median filtering operation is performed on an image by applying the sliding window concept, similar to what is done with convolution. The window is overlaid on the upper-left corner of the image, and the median is determined. This value is put into the output image (buffer) at the position corresponding to the centre location of the window. The window is then slid one pixel over, and the process is repeated.
When the end of the row is reached, the window is slid back to the left side of the image and down one row, and the process is repeated. This continues until the entire image has been processed.
Note that the outer rows and columns are not replaced. In practice this is usually not a problem, because the images are much larger than the masks, and these wasted rows and columns are often filled with zeros (or cropped off the image). For example, with a 3x3 mask we lose one outer row and column, and with a 5x5 mask we lose two rows and columns. This is not visually significant for a typical 256x256 or 512x512 image.
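A sketch of the 3x3 median filter using the sliding-window procedure described above; the output goes into a separate buffer and the outer rows and columns are simply left at zero:

import numpy as np

def median_filter_3x3(image):
    rows, cols = image.shape
    out = np.zeros_like(image)                 # separate buffer; borders stay 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = image[r - 1:r + 2, c - 1:c + 2]
            out[r, c] = np.median(window)      # middle value of the 9 sorted pixels
    return out

neighbourhood = np.array([[5, 5, 6],
                          [3, 4, 5],
                          [3, 4, 7]])
print(median_filter_3x3(neighbourhood))        # the centre value becomes 5, as in the example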
The maximum and minimum filters are two order filters that can be used for the elimination of salt-and-pepper noise. The maximum filter selects the largest value within an ordered window of pixel values, whereas the minimum filter selects the smallest value.
The minimum filter works best for salt-type noise (high values), and the maximum filter works best for pepper-type noise.
In a manner similar to the median, minimum, and maximum filters, order filters can be defined to select a specific pixel rank within the ordered set. For example, we may find for certain types of pepper noise that selecting the second-highest value works better than selecting the maximum value. This type of ordered selection is very sensitive to the type of image, and its use is application specific. It should be noted that, in general, a minimum or low-rank filter will tend to darken an image and a maximum or high-rank filter will tend to brighten an image.
The midpoint filter is actually both an order and a mean filter, because it relies on ordering the pixel values but is then calculated by an averaging process. The midpoint filter is the average of the maximum and minimum within the window, as follows:

Ordered set: I1 <= I2 <= I3 <= ... <= I(N^2)
Midpoint = (I1 + I(N^2)) / 2

The midpoint filter is most useful for Gaussian and uniform noise.
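The maximum, minimum, and midpoint filters differ from the median only in which element of the ordered window is selected; a brief sketch for a single window:

import numpy as np

window = np.array([[5, 5, 6],
                   [3, 4, 5],
                   [3, 4, 7]])

ordered = np.sort(window.ravel())               # I1 <= I2 <= ... <= I(N^2)

maximum  = ordered[-1]                          # best against pepper-type (low-value) noise
minimum  = ordered[0]                           # best against salt-type (high-value) noise
midpoint = (ordered[0] + ordered[-1]) / 2.0     # (I1 + I(N^2)) / 2

print(maximum, minimum, midpoint)               # 7 3 5.0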

2.5.3 The Enhancement Filters
The enhancement filters are:
1. Laplacian-type filters.
2. Difference filters.
These filters tend to bring out, or enhance, details in the image. Examples of convolution masks for the Laplacian-type filters are:

 0 -1  0      -1 -1 -1      -2  1 -2
-1  5 -1      -1  9 -1       1  5  1
 0 -1  0      -1 -1 -1      -2  1 -2
The Laplacian type filters will enhance details in all directions equally.

Figure (2.12): Laplacian Filter.
a. Original image b. Laplacian filtered image
The difference filters will enhance details in the direction specific to the
mask selected. There are four different filter convolution masks,
corresponding to lines in the vertical, horizontal and two diagonal directions.
Vertical Horizontal
0 1 0 0 0 0
0 1 0 1 1 -1
0 -1 0 0 0 0

Diagonal 1 Diagonal 2
1 0 0 0 0 1
0 1 0 0 1 0
0 0 -1 -1 0 0


Figure (2.13): Difference Filter
a. Original image b. Difference filtered image
2.6 Image Quantization
Image quantization is the process of reducing the image data by removing some of the detail information, by mapping groups of data points to a single point. This can be done to either:
1. The pixel values themselves, I(r,c): gray-level reduction.
2. The spatial coordinates (r,c): spatial reduction.
The simplest method of gray-level reduction is thresholding. We select a threshold gray level and set everything above that value equal to 1 and everything below the threshold equal to 0. This effectively turns a gray-level image into a binary (two-level) image and is often used as a preprocessing step in the extraction of object features, such as shape, area, or perimeter.
A more versatile method of gray-level reduction is the process of taking the data and reducing the number of bits per pixel. This can be done very efficiently by masking the lower bits via an AND operation. With this method, the number of bits that are masked determines the number of gray levels available.
Example:
We want to reduce 8-bit information containing 256 possible gray-level values down to 32 possible values. This can be done by ANDing each 8-bit value with the bit string 11111000. This is equivalent to dividing by eight (2^3), corresponding to the lower three bits that we are masking, and then shifting the result left three times. [Gray levels 0-7 in the image are mapped to 0, gray levels in the range 8-15 are mapped to 8, and so on.]
We can see that by masking the lower three bits we reduce 256 gray levels to 32 gray levels:
256 / 8 = 32
The general case requires us to mask k bits, where 2^k is divided into the original gray-level range to get the quantized range desired. Using this method, we can reduce the number of gray levels to any power of 2: 2, 4, 8, 16, 32, 64, or 128.
Image quantization by masking to 128 gray levels can be done by ANDing each 8-bit value with the bit string 11111110 (dividing by 2^1).
Image quantization by masking to 64 gray levels can be done by ANDing each 8-bit value with the bit string 11111100 (dividing by 2^2).
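A sketch of gray-level quantization by masking the lower bits, assuming 8-bit NumPy data (the masks 11111000, 11111100, and 11111110 give 32, 64, and 128 levels respectively):

import numpy as np

# Hypothetical 8-bit gray-level values.
gray = np.array([0, 5, 8, 15, 130, 200, 255], dtype=np.uint8)

to_32_levels  = gray & 0b11111000    # mask lower 3 bits: 256 / 2^3 = 32 levels
to_64_levels  = gray & 0b11111100    # mask lower 2 bits: 256 / 2^2 = 64 levels
to_128_levels = gray & 0b11111110    # mask lower 1 bit:  256 / 2^1 = 128 levels

print(to_32_levels)                  # [  0   0   8   8 128 200 248]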
As the number of gray levels decreases, we can see an increase in a phenomenon called contouring. Contouring appears in the image as false edges, or lines, resulting from the gray-level quantization method.
Figure (2-14): False Contouring. (Original 8-bit image, 256 gray levels; quantized to 6 bits, 64 gray levels; quantized to 3 bits, 8 gray levels; quantized to 1 bit, 2 gray levels.)
This false contouring effect can be visually improved upon by using an IGS (improved gray-scale) quantization method. In the IGS method the improvement comes from adding a small random number to each pixel before quantization, which results in a more visually pleasing appearance.
Figure (2-15): IGS Quantization. (Original image; IGS quantization to 8 levels (3 bits); uniform quantization to 8 levels (3 bits).)
2.7 Edge Detection
Detecting edges is a basic operation in image processing. The edges of items in an image hold much of the information in the image. The edges tell you where items are, their size, their shape, and something about their texture.
Edge detection methods are used as a first step in the line detection process, and they are used to find object boundaries by marking potential edge points corresponding to places in an image where rapid changes in brightness occur. After these edge points have been marked, they can be merged to form lines and object outlines.
Edge detection operations are based on the idea that edge information in an image is found by looking at the relationship a pixel has with its neighbors. If a pixel's gray-level value is similar to those around it, there is probably not an edge at that point. However, if a pixel has neighbors with widely varying gray levels, it may represent an edge point. In other words, an edge is defined by a discontinuity in gray-level values. Ideally, an edge separates two distinct objects. In practice, edges are caused by:
Changes in color or texture, or
Specific lighting conditions present during the image acquisition process.
The following figure illustrates the difference between an ideal edge and a
real edge.
a. Ideal Edge b. Real Edge
Figure (2-16): Ideal vs. Real Edge
The vertical axis represents brightness, and the horizontal axis shows the spatial coordinates. The abrupt change in brightness characterizes an ideal edge. In figure (b) we see the representation of a real edge, which changes gradually. This gradual change is a minor form of blurring caused by imaging devices, the lenses, or the lighting, and it is typical of real-world (as opposed to computer-generated) images.
An edge is where the gray level of the image moves from an area of low values to an area of high values, or vice versa. The edge itself is at the centre of this transition. A detected edge gives a bright spot at the edge and a dark area everywhere else. This means the detected edge is the slope, or rate of change, of the gray level in the edge.
How do you calculate the derivative (the slope) of an image in all directions?
Convolution of the image with masks is the most often used technique for doing this. The idea is to take a 3x3 array of numbers and multiply it, point by point, with a 3x3 section of the image; you then sum the products and place the result in the centre point of the image.
The question in this operation is how to choose the 3x3 mask. There are several masks that amplify the slope of the edge. Take the simple one-dimensional case, and take as an example points on the ideal edge near the edge. They could have values such as [3 5 7]. The slope through these three points is (7-3)/2 = 2. If you convolve these three points with the mask [-1 0 1] you get -3 + 7 = 4.
The convolution amplified the slope, and the result is a large number at the transition point in the edge.
There are two basic principles for each edge detector mask:
First: the numbers in the mask sum to zero. If a 3x3 area of an image contains a constant value (such as all ones), then there are no edges in that area. The result of convolving that area with a mask should be zero. If the numbers in the mask sum to zero, then convolving the mask with a constant area will give the correct answer of zero.
Second: the mask should approximate differentiation, or amplify the slope of the edge. The simple example [-1 0 1] given earlier showed how to amplify the slope of the edge.
The number of masks used for edge detection is almost limitless. Researchers have used different techniques to derive masks; some of these are illustrated in the following sections.
1- Sobel Operator: The Sobel edge detection masks look for edges in both
the horizontal and vertical directions and then combine this information into
a single metric. The masks are as follows:
Row Mask Column Mask
-1 -2 -1 -1 0 1
0 0 0 -2 0 2
1 2 1 -1 0 1
These masks are each convolved with the image. At each pixel location we now have two numbers: S1, corresponding to the result from the row mask, and S2, from the column mask. We use these numbers to compute two metrics, the edge magnitude and the edge direction, which are defined as follows:

Edge Magnitude = sqrt(S1^2 + S2^2)

Edge Direction = tan^-1(S1 / S2)
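A sketch of the Sobel computation with the row and column masks above, using SciPy's convolve2d (this is only an illustration, not a particular library's built-in Sobel routine):

import numpy as np
from scipy.signal import convolve2d

row_mask = np.array([[-1, -2, -1],
                     [ 0,  0,  0],
                     [ 1,  2,  1]], dtype=float)
col_mask = np.array([[-1, 0, 1],
                     [-2, 0, 2],
                     [-1, 0, 1]], dtype=float)

# Hypothetical image with a vertical edge down the middle.
image = np.array([[10, 10, 200, 200],
                  [10, 10, 200, 200],
                  [10, 10, 200, 200],
                  [10, 10, 200, 200]], dtype=float)

s1 = convolve2d(image, row_mask, mode='same', boundary='symm')
s2 = convolve2d(image, col_mask, mode='same', boundary='symm')

magnitude = np.sqrt(s1 ** 2 + s2 ** 2)   # edge magnitude = sqrt(S1^2 + S2^2)
direction = np.arctan2(s1, s2)           # edge direction = tan^-1(S1 / S2), in radians

print(np.round(magnitude))               # large values along the vertical edge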
2- Prewitt Operator: The Prewitt is similar to the Sobel but with different
mask coefficients. The masks are defined as follows:
Row Mask Column Mask
-1 -1 -1 -1 0 1
0 0 0 -1 0 1
1 1 1 -1 0 1
These masks are each convolved with the image. At each pixel location we find two numbers: P1, corresponding to the result from the row mask, and P2, from the column mask. We use these results to determine two metrics, the edge magnitude and the edge direction, which are defined as follows:

Edge Magnitude = sqrt(P1^2 + P2^2)

Edge Direction = tan^-1(P1 / P2)
3- Kirsch Compass Masks: the Kirsch edge detection masks are called compass masks because they are defined by taking a single mask and rotating it to the eight major compass orientations: north, north-east, east, south-east, south, south-west, west, and north-west. The masks are defined as follows:
  K0            K1            K2            K3
-3 -3  5      -3  5  5       5  5  5       5  5 -3
-3  0  5      -3  0  5      -3  0 -3       5  0 -3
-3 -3  5      -3 -3 -3      -3 -3 -3      -3 -3 -3

  K4            K5            K6            K7
 5 -3 -3      -3 -3 -3      -3 -3 -3      -3 -3 -3
 5  0 -3       5  0 -3      -3  0 -3      -3  0  5
 5 -3 -3       5  5 -3       5  5  5      -3  5  5
The edge magnitude is defined as the maximum value found by the convolution of each of the masks with the image.
[Given a pixel, there are eight directions in which you can travel to a neighbouring pixel (above, below, left, right, upper left, upper right, lower left, lower right). Therefore there are eight possible directions for an edge. The directional edge detectors can detect an edge in only one of the eight directions. If you want to detect only left-to-right edges, you would use only one of the eight masks. If, however, you want to detect all of the edges, you would need to perform convolution over the image eight times, using each of the eight masks.]
4- Robinson Compass Masks: the Robinson compass masks are used in a manner similar to the Kirsch masks, but are easier to implement because they rely only on coefficients of 0, 1, and 2 and are symmetrical about their directional axis (the axis with the zeros). We only need to compute the results for four of the masks; the results from the other four can be obtained by negating the results from the first four. The masks are as follows:
  R0            R1            R2            R3
-1  0  1       0  1  2       1  2  1       2  1  0
-2  0  2      -1  0  1       0  0  0       1  0 -1
-1  0  1      -2 -1  0      -1 -2 -1       0 -1 -2

  R4            R5            R6            R7
 1  0 -1       0 -1 -2      -1 -2 -1      -2 -1  0
 2  0 -2       1  0 -1       0  0  0      -1  0  1
 1  0 -1       2  1  0       1  2  1       0  1  2
The edge magnitude is defined as the maximum value found by the convolution of each of the masks with the image. The edge direction is defined by the mask that produces the maximum magnitude.
It is interesting to note that masks R0 and R6 are the same as the Sobel masks. We can see that any of the edge detection masks can be extended by rotating them in a manner like these compass masks, which allows us to extract explicit information about edges in any direction.
5- Laplacian Operators: the Laplacian operators described here are similar to the ones used for preprocessing (as described under the enhancement filters). The three Laplacian masks that follow represent different approximations of the Laplacian. The Laplacian masks are rotationally symmetric, which means edges at all orientations contribute to the result. They are applied by selecting one mask and convolving it with the image.
Laplacian masks:
 0 -1  0       1 -2  1      -1 -1 -1
-1  4 -1      -2  4 -2      -1  8 -1
 0 -1  0       1 -2  1      -1 -1 -1
These masks differ from the Laplacian type previously described in that the centre coefficients have been decreased by one. So, if we are only interested in edge information, the sum of the coefficients should be zero. If we want to retain most of the information, the coefficients should sum to a number greater than zero. In the extreme case, the larger the centre coefficient, the more the result will depend on the current pixel value, with only minimal contribution from the surrounding pixel values.
6- Other Edge Detection Methods
Two other methods, using Gaussian and homogeneity/difference operators, are given below:

 0  0 -1 -1 -1  0  0
 0 -2 -3 -3 -3 -2  0
-1 -3  5  5  5 -3 -1
-1 -3  5 16  5 -3 -1
-1 -3  5  5  5 -3 -1
 0 -2 -3 -3 -3 -2  0
 0  0 -1 -1 -1  0  0

7x7 Gaussian mask to detect large edges
The Gaussian edge detector has the advantage that the detail in the output image can be adjusted by varying the width of the convolution mask. A wider mask eliminates small or fine edges and detects only large, significant edges.
Other than by masking, edge detection can also be performed by subtraction. Two methods that use subtraction to detect edges are the homogeneity operator and the difference operator.
The homogeneity operator subtracts each of the pixels next to the centre of an n x n area (where n is usually 3) from the centre pixel. The result is the maximum of the absolute values of these subtractions. Subtraction in a homogeneous region produces zero, which indicates an absence of edges. A high maximum of the subtractions indicates an edge. This is a quick operator, since it performs only subtraction (eight operations per pixel) and no multiplication. This operator then requires thresholding: if there is no thresholding, the resulting image looks like a faded copy of the original. Generally, thresholding at 30 to 50 gives good results. The threshold can be varied depending upon the extent of edge detection desired.
The difference operator performs differentiation by calculating the differences between the pixels that surround the centre pixel of an n x n area. This operator finds the absolute value of the difference between the opposite pixels: the upper left minus the lower right, the upper right minus the lower left, left minus right, and top minus bottom. The result is the maximum of these absolute values. As in the homogeneity case, this operator requires thresholding, but it is quicker than the homogeneity operator since it uses four integer subtractions per pixel, as against eight subtractions per pixel for the homogeneity operator.
Shown below is how the two operators detect an edge. Consider an image block with centre pixel intensity 5:

1 2 3
4 5 6
7 8 9

Output of the homogeneity operator:
Max of {| 5-1 |, | 5-2 |, | 5-3 |, | 5-4 |, | 5-6 |, | 5-7 |, | 5-8 |, | 5-9 |} = 4

Output of the difference operator:
Max of {| 1-9 |, | 7-3 |, | 4-6 |, | 2-8 |} = 8
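The two subtraction-based operators applied to the 3x3 block above can be written directly; a minimal sketch (the thresholding step is omitted):

import numpy as np

block = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
centre = block[1, 1]

# Homogeneity operator: centre minus each of its eight neighbours.
neighbours = np.delete(block.ravel(), 4)             # drop the centre itself
homogeneity = np.max(np.abs(centre - neighbours))    # -> 4

# Difference operator: differences between the four opposite pixel pairs.
flat = block.ravel()
pairs = [(flat[0], flat[8]),    # upper-left  minus lower-right
         (flat[6], flat[2]),    # lower-left  minus upper-right
         (flat[3], flat[5]),    # left        minus right
         (flat[1], flat[7])]    # top         minus bottom
difference = max(abs(int(a) - int(b)) for a, b in pairs)   # -> 8

print(homogeneity, difference)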
Chapter Three
Histogram
2.8 Histogram
The histogram of an image is a plot of the gray-level values versus the number of pixels at each value.
A histogram appears as a graph with "brightness" on the horizontal axis, from 0 to 255 (for an 8-bit intensity scale), and "number of pixels" on the vertical axis. For a colored image, three histograms are computed, one for each component (RGB, HSL). The histogram gives us a convenient, easy-to-read representation of the concentration of pixels versus brightness in an image. Using this graph we are able to see immediately:
1. Whether an image is basically dark or light, and high or low contrast.
2. Our first clues about what contrast enhancement would be appropriate to make the image more subjectively pleasing to an observer, or easier to interpret by succeeding image analysis operations.
So the shape of the histogram provides us with information about the nature of the image, or of a sub-image if we are considering an object within the image. For example:
1. A very narrow histogram implies a low-contrast image.
2. A histogram skewed toward the high end implies a bright image.
3. A histogram with two major peaks, called bimodal, implies an object that is in contrast with the background.
Examples of the different types of histograms are shown in figure (2-17).
Figure (2-17): Different Types of Histogram.

2.9 Histogram Modifications
The gray-level histogram of an image is the distribution of the gray levels in that image. The histogram can be modified by mapping functions, which will stretch, shrink (compress), or slide the histogram. Figure (2-18) illustrates a graphical representation of histogram stretch, shrink, and slide.

Figure (2-18): Histogram Modifications.
The mapping function for histogram stretch can be found by the
following equation:
Stretch (I (r, c)) =

min
) , (
max
) , (
min
) , ( ) , (
c r I c r I
c r I c r I
[MAX-MIN] + MIN.
Chapter_ Two Image Analysis
42
where I(r,c)_{\max} is the largest gray-level value in the image I(r,c), I(r,c)_{\min} is the smallest gray-level value in the image I(r,c), and MAX and MIN correspond to the maximum and minimum gray-level values possible (for an 8-bit image these are 255 and 0).
This equation will take an image and stretch the histogram across the entire gray-level range, which has the effect of increasing the contrast of a low-contrast image (see figure (2-19) for an example of histogram stretching).
Figure (2-19): Histogram stretching. (Panels: low-contrast image and its histogram; image after histogram stretching and its histogram.)
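A minimal sketch of this mapping, assuming an 8-bit grayscale image stored as a NumPy array (the function name is my own; MAX and MIN from the equation appear as new_max and new_min):

```python
import numpy as np

def histogram_stretch(image, new_min=0, new_max=255):
    """Map the image range [I_min, I_max] linearly onto [new_min, new_max]."""
    I = image.astype(np.float64)
    i_min, i_max = I.min(), I.max()
    stretched = (I - i_min) / (i_max - i_min) * (new_max - new_min) + new_min
    return np.round(stretched).astype(np.uint8)
```

Applied to a low-contrast image, this spreads the histogram across the full 0 to 255 range, as illustrated in figure (2-19).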
If most of the pixel values in an image fall within a small range, but a few outliers force the histogram to span the entire range, a pure histogram stretch will not improve the image. In this case it is useful to allow a small percentage of the pixel values to be clipped at the low and high ends of the range (for an 8-bit image this means truncating at 0 and 255). See figure (2-20) for a stretched and clipped histogram.
Figure (2-20): Histogram stretching with clipping. (Panels: original image and its histogram; image after histogram stretching without clipping and its histogram; image after histogram stretching with 3% of the low and high values clipped and its histogram.)
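One common way to implement the clipped stretch is to pick the clipping points from percentiles of the pixel values; the sketch below assumes that approach (it is my own illustration, not necessarily how the figure above was produced):

```python
import numpy as np

def clipped_stretch(image, clip_percent=3, new_min=0, new_max=255):
    """Clip clip_percent % of the pixels at each end, then stretch what remains."""
    I = image.astype(np.float64)
    low = np.percentile(I, clip_percent)            # low clipping point
    high = np.percentile(I, 100 - clip_percent)     # high clipping point
    I = np.clip(I, low, high)                       # truncate the outliers
    stretched = (I - low) / (high - low) * (new_max - new_min) + new_min
    return np.round(stretched).astype(np.uint8)
```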
The opposite of a histogram stretch is a histogram shrink, which will
decrease image contrast by compressing the gray levels. The mapping
function for a histogram shrink can be found by the following equation:
Shrink(I(r,c)) = \left[ \frac{Shrink_{\max} - Shrink_{\min}}{I(r,c)_{\max} - I(r,c)_{\min}} \right] [I(r,c) - I(r,c)_{\min}] + Shrink_{\min}

where Shrink_max and Shrink_min correspond to the maximum and minimum gray-level values desired in the compressed histogram. In general, this process produces an image of reduced contrast and may not seem to be useful as an image enhancement tool (see figure (2-21) for an example of histogram shrinking).
Figure (2-21): Histogram shrinking. (Panels: original image and its histogram; image after histogram shrink to the range [75, 175] and its histogram.)
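A minimal sketch of the shrink mapping, with the [75, 175] range of the figure used as default arguments (the function name and defaults are my own):

```python
import numpy as np

def histogram_shrink(image, shrink_min=75, shrink_max=175):
    """Compress the gray-level range of the image into [shrink_min, shrink_max]."""
    I = image.astype(np.float64)
    i_min, i_max = I.min(), I.max()
    shrunk = (shrink_max - shrink_min) / (i_max - i_min) * (I - i_min) + shrink_min
    return np.round(shrunk).astype(np.uint8)
```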
The histogram slide technique can be used to make an image either darker or lighter while retaining the relationships between the gray-level values. This can be accomplished by simply adding or subtracting a fixed number from all of the gray-level values, as follows:

Slide(I(r,c)) = I(r,c) + OFFSET

where the OFFSET value is the amount by which to slide the histogram. In this equation, a positive OFFSET value will increase the overall brightness, whereas a negative OFFSET will create a darker image. Figure (2-22) shows histogram sliding.
Figure (2-22): Histogram sliding. (Panels: original image and its histogram; image after positive-value histogram sliding and its histogram.)
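A minimal sketch of the slide operation (the clipping to 0..255 is my addition to keep the result a valid 8-bit image; it is not stated in the equation above):

```python
import numpy as np

def histogram_slide(image, offset):
    """Add a fixed offset to every gray level; clip so the result stays in 0..255."""
    slid = image.astype(np.int32) + offset
    return np.clip(slid, 0, 255).astype(np.uint8)
```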
2.10 Histogram Equalization
Histogram equalization is a popular technique for improving the appearance of a poor image. Its function is similar to that of a histogram stretch, but it often provides more visually pleasing results across a wider range of images. Histogram equalization is a technique where the histogram of the resultant image is as flat as possible (with histogram stretching the overall shape of the histogram remains the same).
The idea is that a histogram whose gray levels are grouped closely together is spread out, or flattened; this makes the dark pixels appear darker and the light pixels appear lighter (the key word is "appear": the dark pixels in a photograph cannot be made any darker, but if the pixels that are only slightly lighter become much lighter, then the dark pixels will appear darker).
The histogram equalization process for digital images consists of four steps:
1. Find the running sum of the histogram values.
2. Normalize the values from step 1 by dividing by the total number of pixels.
3. Multiply the values from step 2 by the maximum gray-level value and round.
4. Map the gray-level values to the results from step 3, using a one-to-one correspondence.
The following example will help to clarify this process, and a short code sketch is given after it.
Example:
We have an image with 3 bits/pixel, so the possible range of values is 0 to 7, and with the following histogram:

Gray-level value:    0    1    2    3    4    5    6    7
Number of pixels:   10    8    9    2   14    1    5    2
Step 1: Create a running sum of the histogram values. This means that the first value is 10, the second is 10 + 8 = 18, the next is 10 + 8 + 9 = 27, and so on. Here we get 10, 18, 27, 29, 43, 44, 49, 51.
Step 2: Normalize by dividing by the total number of pixels. The total number of pixels is 10 + 8 + 9 + 2 + 14 + 1 + 5 + 2 = 51.
Step 3: Multiply these values by the maximum gray-level value, in this case 7, and then round the result to the closest integer. After this is done we obtain 1, 2, 4, 4, 6, 6, 7, 7.
Step 4: Map the original values to the results from step 3 by a one-to-one correspondence.
The first three steps:

Gray level:          0      1      2      3      4      5      6      7
Number of pixels:   10      8      9      2     14      1      5      2
Running sum:        10     18     27     29     43     44     49     51
Normalized:       10/51  18/51  27/51  29/51  43/51  44/51  49/51  51/51
Multiplied by 7:     1      2      4      4      6      6      7      7

The fourth step:

Old gray level:  0  1  2  3  4  5  6  7
New gray level:  1  2  4  4  6  6  7  7

All pixels in the original image with gray level 0 are set to 1, values of 1 are set to 2, 2 is set to 4, 3 is set to 4, and so on. In figure (2-23) you can see the original histogram and the resulting equalized histogram. Although the result is not perfectly flat, it is closer to being flat than the original.
Figure (2-23): Histogram equalization. (Panels: original image and its histogram; image after histogram equalization and its histogram.)
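A compact sketch of the four steps (not from the notes; it assumes the image is a NumPy array of integer gray levels), followed by a check against the worked example:

```python
import numpy as np

def histogram_equalize(image, levels=8):
    """The four-step procedure from the text for an image with `levels` gray levels."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    running_sum = np.cumsum(hist)                      # step 1
    normalized = running_sum / image.size              # step 2
    mapping = np.round(normalized * (levels - 1))      # step 3
    return mapping.astype(np.uint8)[image]             # step 4: use as a look-up table

# Reproduce the worked example: histogram 10, 8, 9, 2, 14, 1, 5, 2 over levels 0..7
pixels = np.repeat(np.arange(8), [10, 8, 9, 2, 14, 1, 5, 2])
equalized = histogram_equalize(pixels, levels=8)
print(sorted(set(zip(pixels.tolist(), equalized.tolist()))))
# [(0, 1), (1, 2), (2, 4), (3, 4), (4, 6), (5, 6), (6, 7), (7, 7)]
```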
Histogram Features
The histogram features that we consider are statistically based features, where the histogram is used as a model of the probability distribution of the gray levels. These statistical features provide us with information about the characteristics of the gray-level distribution for the image or sub-image. We define the first-order histogram probability P(g) as:

P(g) = \frac{N(g)}{M}

where M is the number of pixels in the image or sub-image (if the entire image is under consideration, then M = N^2 for an N \times N image), and N(g) is the number of pixels at gray level g. As with any probability distribution, all values of P(g) are less than or equal to 1. The features based on the first-order histogram probability are the mean, standard deviation, skew, energy and entropy.
1. Mean: the mean is the average value, so it tells us something about the general brightness of the image. A bright image will have a high mean, and a dark image will have a low mean. We will use L as the total number of gray levels available, so the gray levels range from 0 to L - 1. For example, for typical 8-bit image data, L is 256 and the gray levels range from 0 to 255. We can define the mean as follows:

\bar{g} = \sum_{g=0}^{L-1} g \, P(g) = \sum_{r} \sum_{c} \frac{I(r,c)}{M}
If we use the second form of the equation, we sum over the rows and
columns corresponding to the pixels in the image or sub image under
consideration.
2. Standard deviation: the standard deviation, which is also the square root of the variance, tells us something about the contrast. It describes the spread in the data, so a high-contrast image will have a high variance, and a low-contrast image will have a low variance. It is defined as follows:

\sigma_g = \sqrt{ \sum_{g=0}^{L-1} (g - \bar{g})^2 \, P(g) }
3. Skew: the skew measures the asymmetry about the mean in the gray-level distribution. It can be defined as:

Skew = \frac{\bar{g} - mode}{\sigma_g}

where the mode is the gray level with the greatest histogram value (the peak). This way of measuring the skew is computationally efficient, especially considering that, typically, the mean and standard deviation have already been calculated.
4. Energy: the energy measure tells us something about how the gray levels are distributed:

Energy = \sum_{g=0}^{L-1} [P(g)]^2

The energy measure has a maximum value of 1 for an image with a constant value, and it gets increasingly smaller as the pixel values are distributed across more gray-level values (remember that all the P(g) values are less than or equal to 1). The larger this value is, the easier it is to compress the image data. If the energy is high, it tells us that the number of gray levels in the image is small, that is, the distribution is concentrated in only a small number of different gray levels.
5. Entropy: the entropy is a measure that tells us how many bits we need to code the image data, and it is given by:

Entropy = -\sum_{g=0}^{L-1} P(g) \log_2 [P(g)]

As the pixel values in the image are distributed among more gray levels, the entropy increases. This measure tends to vary inversely with the energy.
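The following sketch (not part of the original notes; the function name and the NumPy-based approach are my own choices) computes the five histogram features above for a grayscale image stored as a NumPy array:

```python
import numpy as np

def histogram_features(image, levels=256):
    """Mean, standard deviation, skew, energy and entropy from the histogram."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / image.size                        # first-order probability P(g)
    g = np.arange(levels)
    mean = np.sum(g * p)
    std = np.sqrt(np.sum((g - mean) ** 2 * p))   # zero for a constant image
    mode = np.argmax(hist)                       # gray level with the largest count
    skew = (mean - mode) / std                   # undefined when std is zero
    energy = np.sum(p ** 2)
    nonzero = p[p > 0]                           # log2(0) is undefined, skip empty bins
    entropy = -np.sum(nonzero * np.log2(nonzero))
    return mean, std, skew, energy, entropy
```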
Basic Relationships between Pixels
In this section, we consider several important relationships between
pixels in a digital image. As mentioned before, an image is denoted by
f(x, y). When referring in this section to a particular pixel, we use lowercase letters, such as p and q.
1- Neighbors of a Pixel
A pixel p at coordinates (x, y) has four horizontal and vertical
neighbors whose coordinates are given by
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each
pixel is a unit distance from (x, y), and some of the neighbors of p lie
outside the digital image if (x, y) is on the border of the image.
The four diagonal neighbors of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
and are denoted by ND(p). These points, together with the 4-neighbors,
are called the 8-neighbors of p, denoted by N8(p). As before, some of the
points in ND(p) and N8(p) fall outside the image if (x, y) is on the border
of the image.
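A small sketch of these neighbourhood sets (illustrative helper functions of my own, not from the notes):

```python
def n4(x, y):
    """The 4-neighbours of a pixel p at (x, y)."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """The four diagonal neighbours of p."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    """The 8-neighbours of p: the union of N4(p) and ND(p)."""
    return n4(x, y) + nd(x, y)

def inside(coords, width, height):
    """Drop neighbours that fall outside the image, as happens on the border."""
    return [(x, y) for (x, y) in coords if 0 <= x < width and 0 <= y < height]

print(inside(n8(0, 0), 5, 5))   # only 3 of the 8 neighbours survive at a corner
```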
2- Adjacency, Connectivity, Regions, and Boundaries
Connectivity between pixels is a fundamental concept that simplifies
the definition of numerous digital image concepts, such as regions and
boundaries. To establish if two pixels are connected, it must be
determined if they are neighbors and if their gray levels satisfy a specified
criterion of similarity (say, if their gray levels are equal). For instance, in
a binary image with values 0 and 1, two pixels may be 4-neighbors, but
they are said to be connected only if they have the same value.
Let V be the set of gray-level values used to define adjacency. In a binary
image, V={1} if we are referring to adjacency of pixels with value 1. In a
grayscale image, the idea is the same, but set V typically contains more
elements. For example, in the adjacency of pixels with a range of possible
gray-level values 0 to 255, set V could be any subset of these 256 values.
We consider three types of adjacency:
(a) 4-adjacency. Two pixels p and q with values from V are 4-adjacent if
q is in the set N4(p).
(b) 8-adjacency. Two pixels p and q with values from V are 8-adjacent if
q is in the set N8(p).
(c) m-adjacency (mixed adjacency). Two pixels p and q with values from V are m-adjacent if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
Chapter Three
Image Compression
3.1 Introduction
Image compression involves reducing the size of an image data file, while retaining the necessary information. The reduced file is called the compressed file and is used to reconstruct the image, resulting in the decompressed image. The original image, before any compression is performed, is called the uncompressed image file. The ratio of the size of the original, uncompressed image file to that of the compressed file is referred to as the compression ratio:
Compression\ Ratio = \frac{Uncompressed\ File\ Size}{Compressed\ File\ Size} = \frac{Size_U}{Size_C}

It is often written as Size_U : Size_C.

Example: the original image is 256 × 256 pixels, single band (gray scale), 8 bits per pixel. This file is 65,536 bytes (64K). After compression the image file is 6,554 bytes. The compression ratio is:

Size_U : Size_C = \frac{65536}{6554} = 9.999 \approx 10

This can be written as 10:1.
This is called a 10 to 1 compression, or a 10 times compression, or it can be stated as compressing the image to 1/10 of its original size. Another way to state the compression is to use the terminology of bits per pixel. For an N × N image:

Bits\ per\ Pixel = \frac{Number\ of\ Bits}{Number\ of\ Pixels} = \frac{8 \times (Number\ of\ Bytes)}{N \times N}
Example: using the preceding example, with a compression ratio of 65536/6554 bytes, we want to express this as bits per pixel. This is done by first finding the number of pixels in the image:

256 × 256 = 65,536 pixels

We then find the number of bits in the compressed image file:

6554 × (8 bits/byte) = 52,432 bits

Now we can find the bits per pixel by taking the ratio:

52432 / 65536 = 0.8 bits/pixel
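The two formulas above as a short sketch (the function names are mine), checked against the worked examples:

```python
def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Uncompressed file size divided by compressed file size."""
    return uncompressed_bytes / compressed_bytes

def bits_per_pixel(compressed_bytes, n):
    """Bits per pixel when an N x N image is stored in `compressed_bytes` bytes."""
    return 8 * compressed_bytes / (n * n)

# Values from the worked examples above
print(compression_ratio(65536, 6554))   # ~9.999, i.e. about 10:1
print(bits_per_pixel(6554, 256))        # ~0.8 bits/pixel
```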
The reduction in file size is necessary to meet:
1. The bandwidth requirements of many transmission systems.
2. The storage requirements of computer databases.
The amount of data required for digital images is enormous. For example, a single 512 × 512, 8-bit image requires 2,097,152 bits for storage:

512 × 512 × 8 = 2,097,152 bits

If we wanted to transmit this image over the World Wide Web, it would probably take minutes, too long for most people to wait.
Example: to transmit a digitized color image scanned at 3,000 × 2,000 pixels with 24 bits per pixel, at 28.8 kilobits/second, it would take about

\frac{(3000 \times 2000\ pixels)(24\ bits/pixel)}{28.8 \times 1024\ bits/second} \approx 4883\ seconds \approx 81\ minutes

Couple this result with transmitting multiple images or motion images, and the necessity of image compression can be appreciated.
The key to a successful compression scheme comes with the second part of the definition: retaining the necessary information. To understand this we must differentiate between data and information. For digital images, data refer to the pixel gray-level values that correspond to the brightness of a pixel at a point in space. Information is an interpretation of the data in a meaningful way. Data are used to convey information, much like the way the alphabet is used to convey information via words. Information is an elusive concept; it can be application specific. For example, in a binary image that contains text only, the necessary information may only involve the text being readable, whereas for a medical image the necessary information may be every minute detail in the original image.
There are two primary types of image compression methods:
1. Lossless compression: this compression is called lossless because no data are lost, and the original image can be recreated exactly from the compressed data. For simple images, such as text-only images, lossless methods can provide substantial compression.
2. Lossy compression: these compression methods are called lossy because they allow a loss in the actual image data, so the original uncompressed image cannot be recreated exactly from the compressed file. For complex images these techniques can achieve much higher compression ratios than lossless methods while still retaining high-quality visual information; for simple images, or where lower-quality results are acceptable, compression ratios as high as 100:1 or 200:1 can be attained.
3.2 Compression System Model
The compression system model consists of two parts: Compressor and
Decompressor.
Compressor: consists of a preprocessing stage followed by an encoding stage.
Decompressor: consists of a decoding stage followed by a postprocessing stage, as in the following figure:
Figure (3.1): Compression System Model.
Before encoding, preprocessing is performed to prepare the image for the
encoding process, and consists of any number of operations that are
application specific. After the compressed file has been decoded, post
processing can be performed to eliminate some of the undesirable
artifacts brought about by the compression process. Often, many practical
compression algorithms are a combination of a number of different
individual compression techniques.
Figure (3.1) consists of two parts: (a) Compression: input image I(r,c) → preprocessing → encoding → compressed file. (b) Decompression: compressed file → decoding → postprocessing → decompressed image I(r,c).
3.3 Lossless Compression Methods
Lossless compression methods are necessary in some imaging applications. For example, with medical images, the law requires that any archived medical images be stored without any data loss. Many of the lossless techniques were developed for non-image data and, consequently, are not optimal for image compression. In general, the lossless techniques alone provide compression only in the range of about a 10% reduction in file size. However, lossless compression techniques may be used for both preprocessing and postprocessing in image compression algorithms. Additionally, for simple images the lossless techniques can provide substantial compression.
An important concept here is the idea of measuring the average information in an image, referred to as the entropy. The entropy for an N × N image can be calculated by this equation:

Entropy = -\sum_{i=0}^{L-1} p_i \log_2 (p_i)   (in bits per pixel)

where p_i is the probability of the i-th gray level, equal to n_i / N^2, n_i is the total number of pixels with gray value i, and L is the total number of gray levels (e.g. 256 for 8 bits).
Example: let L = 8, meaning that there are 3 bits/pixel in the original image, and let the number of pixels at each gray-level value be equal (they all have the same probability), that is:

p_0 = p_1 = \dots = p_7 = \frac{1}{8}

Now we can calculate the entropy as follows:

Entropy = -\sum_{i=0}^{7} p_i \log_2 (p_i) = -\sum_{i=0}^{7} \frac{1}{8} \log_2 \left( \frac{1}{8} \right) = 3

This tells us that the theoretical minimum for lossless coding for this image is 3 bits/pixel.
Note: \log_2(x) can be found by taking \log_{10}(x) and multiplying by approximately 3.32, since \log_2(x) = \log_{10}(x) / \log_{10}(2) and 1/\log_{10}(2) \approx 3.32.
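A minimal sketch that computes this entropy from an image's gray-level histogram (the function name and test data are my own; it reproduces the 3 bits/pixel result of the example above):

```python
import numpy as np

def image_entropy(image, levels=256):
    """Average information in bits/pixel, computed from the gray-level histogram."""
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / image.size               # probability of each gray level
    p = p[p > 0]                        # empty bins contribute nothing to the sum
    return -np.sum(p * np.log2(p))

# Reproduce the example: 8 equally likely gray levels give 3 bits/pixel
img = np.repeat(np.arange(8), 100)
print(image_entropy(img, levels=8))     # 3.0
```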
Specifying Colors
Any color that can be represented on a computer monitor is specified by means of the three basic colors, Red, Green and Blue, called the RGB colors. By mixing appropriate percentages of these basic colors, one can design almost any color one can imagine.
The model of designing colors based on the intensities of their RGB components is called the RGB model, and it is a fundamental concept in computer graphics. Each color, therefore, is represented by a triplet (Red, Green, Blue), in which red, green and blue are three bytes that represent
Green, Blue), in which red, green and blue are three bytes that represent
the basic color components. The smallest value, 0, indicates the absence
of color. The largest value, 255, indicates full intensity or saturation. The
triplet (0, 0, 0) is black, because all colors are missing, and the triplet
(255, 255, 255) is white. Other colors have various combinations:
( 255,0,0 ) is pure red, ( 0,255,255 ) is a pure cyan ( what one gets when
green and blue are mixed ), and ( 0,128,128 ) is a mid-cyan ( a mix of
mid-green and mid-blue tones ). The possible combinations of the three
basic color components are 256 × 256 × 256, or 16,777,216 colors.

Figure: The RGB color cube.
The process of generating colors with three basic components is based on
the RGB Color cube as shown in the above figure. The three dimensions
of the color cube correspond to the three basic colors. The cube's corners
are assigned each of the three primary colors, their complements, and the
colors black and white. Complementary colors are easily calculated by
subtracting the Color values from 255. For example, the color (0, 0,255)
is a pure blue tone. Its complementary color is (255-0,255-0,255-255), or
(255, 255, 0), which is a pure yellow tone. Blue and Yellow are
complementary colors, and they are mapped to opposite corners of the
cube. The same is true for red and cyan, green and magenta, and black
and white. Adding a color to its complement gives white.
It is noticed that the components of the colors at the corners of the cube
have either zero or full intensity. As we move from one corner to another
along the same edge of the cube, only one of its components changes
value. For example, as we move from the Green to the Yellow corner, the
Red component changes from 0 to 255. The other two components
remain the same. As we move between these two corners, we get all the
available tones from Green to Yellow (256 in all). Similarly, as one
moves from the Yellow to the Red corner, the only component that
changes is the Green, and one gets all the available shades from Yellow to Red. This range of similar colors is called a gradient.
Although we can specify more than 16 million colors, we can't have more than 256 shades of gray. The reason is that a gray tone, including the two extremes (black and white), is made up of equal values of all three primary colors. This is seen from the RGB cube as well: gray shades lie on the cube's diagonal that goes from black to white. As we move along this path, all three basic components change value, but they are always equal. The value (128,128,128) is a mid-gray tone, but the values (129,128,128) aren't gray tones, although they are too close for the human eye to distinguish. That's why it is wasteful to store grayscale pictures using 16-million-color True Color file formats. A 256-color file format stores a grayscale just as accurately and more compactly. Once an image is known to be grayscale, we needn't store all three bytes per pixel. One value is adequate (the other two components have the same value).
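The complement and the gray test described above, as a tiny sketch (the function names are my own):

```python
def complement(rgb):
    """Complementary color: subtract each component from 255."""
    r, g, b = rgb
    return (255 - r, 255 - g, 255 - b)

def is_gray(rgb):
    """A tone is gray when all three components are equal."""
    r, g, b = rgb
    return r == g == b

print(complement((0, 0, 255)))    # (255, 255, 0): blue -> yellow
print(is_gray((128, 128, 128)))   # True, a mid-gray tone
print(is_gray((129, 128, 128)))   # False, not exactly gray
```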
1- HSV Color Format
The HSV (also called HSL or HSI) color space is a color space that describes colors as they are perceived by human beings. HSI (or HSV) stands for hue (H), saturation (S) and intensity (I) (or value V). For example, a blue car reflects a blue hue; hue is an attribute of human perception. The hue, which is essentially the chromatic component of our perception, may again be considered as a weak hue or a strong hue. The colorfulness of a color is described by the saturation component. For example, the color from a single monochromatic source of light, which produces colors of a single wavelength only, is highly saturated, while colors comprising hues of different wavelengths have little chroma and less saturation. The gray colors do not have any hue and hence they have little saturation, or are unsaturated. Saturation is thus a measure of the colorfulness, or the whiteness, of the perceived color.
The lightness (L) or intensity (I) or value (V) essentially provides a
measure of the brightness of colors. This gives a measure of how much
light is reflected from the object or how much light is emitted from a
region.
Figure: The HSL and HSV color spaces.

The HSV image may be computed from an RGB image using different transformations. Some of them are as follows.

The simplest form of the HSV transformation is:

H = \tan^{-1}\!\left[ \frac{\sqrt{3}\,(G - B)}{(R - G) + (R - B)} \right], \quad S = 1 - \frac{\min(R, G, B)}{V}, \quad V = \frac{R + G + B}{3}

However, the hue (H) becomes undefined when the saturation S = 0.

The most popular form of the HSV transformation is shown next, where the r, g, b values are first obtained by normalizing each pixel such that

r = \frac{R}{R+G+B}, \quad g = \frac{G}{R+G+B}, \quad b = \frac{B}{R+G+B}

Accordingly, the H, S and V values can be computed as:

V = \max(r, g, b)

S = \begin{cases} 0, & \text{if } V = 0 \\ \dfrac{V - \min(r, g, b)}{V}, & \text{if } V > 0 \end{cases}

H = \begin{cases} 0, & \text{if } S = 0 \\ \dfrac{60\,(g - b)}{S \cdot V}, & \text{if } V = r \\ 60\left[ 2 + \dfrac{b - r}{S \cdot V} \right], & \text{if } V = g \\ 60\left[ 4 + \dfrac{r - g}{S \cdot V} \right], & \text{if } V = b \end{cases}

H = H + 360 \quad \text{if } H < 0

The HSV color space is compatible with human color perception.
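A direct transcription of the "most popular form" above into code (a sketch of my own, handling the black pixel case explicitly since the formulas leave it undefined):

```python
def rgb_to_hsv(R, G, B):
    """HSV from the normalized r, g, b form above: H in degrees, S and V in [0, 1]."""
    total = R + G + B
    if total == 0:                       # pure black: hue and saturation are undefined
        return 0.0, 0.0, 0.0
    r, g, b = R / total, G / total, B / total
    v = max(r, g, b)
    s = (v - min(r, g, b)) / v           # v > 0 here because total > 0
    if s == 0:
        h = 0.0                          # hue undefined when saturation is zero
    elif v == r:
        h = 60 * (g - b) / (s * v)
    elif v == g:
        h = 60 * (2 + (b - r) / (s * v))
    else:
        h = 60 * (4 + (r - g) / (s * v))
    if h < 0:
        h += 360
    return h, s, v

print(rgb_to_hsv(255, 0, 0))    # (0.0, 1.0, 1.0): pure red
print(rgb_to_hsv(0, 255, 0))    # (120.0, 1.0, 1.0): pure green
```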
2-YCbCr Color Format
Another color space in which luminance and chrominance are separately
represented is the YCbCr. The Y component takes values from 16 to 235,
while Cb and Cr take values from 16 to 240. They are obtained from
gamma-corrected R, G, B values as follows:

\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}