TCS 707-SM02
UNIT-I
Digital Image Processing
1. What is meant by Digital Image Processing? Explain how digital images can
be represented?
An image may be defined as a two-dimensional function, f(x, y), where x and y are spatial
(plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity
or gray level of the image at that point. When x, y, and the amplitude values of f are all finite,
discrete quantities, we call the image a digital image. The field of digital image processing refers
to processing digital images by means of a digital computer. Note that a digital image is
composed of a finite number of elements, each of which has a particular location and value.
These elements are referred to as picture elements, image elements, pels, and pixels. Pixel is the
term most widely used to denote the elements of a digital image.
Vision is the most advanced of our senses, so it is not surprising that images play
the single most important role in human perception. However, unlike humans, who are limited to
the visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire
EM spectrum, ranging from gamma to radio waves. They can operate on images generated by
sources that humans are not accustomed to associating with images. These include ultrasound,
electron microscopy, and computer-generated images. Thus, digital image processing
encompasses a wide and varied field of applications. There is no general agreement among
authors regarding where image processing stops and other related areas, such as image analysis
and computer vision, start. Sometimes a distinction is made by defining image processing as a
discipline in which both the input and output of a process are images. We believe this to be a
limiting and somewhat artificial boundary. For example, under this definition, even the trivial
task of computing the average intensity of an image (which yields a single number) would not be
considered an image processing operation. On the other hand, there are fields such as computer
vision whose ultimate goal is to use computers to emulate human vision, including learning and
being able to make inferences and take actions based on visual inputs. This area itself is a branch
of artificial intelligence (AI) whose objective is to emulate human intelligence. The field of AI is
in its earliest stages of infancy in terms of development, with progress having been much slower
than originally anticipated. The area of image analysis (also called image understanding) is in
between image processing and computer vision. A mid-level process generally takes images as its
inputs, but its outputs are attributes extracted from those images (e.g., edges, contours, and the
identity of individual objects). Finally, higher-level processing involves “making sense” of an
ensemble of recognized objects, as in image analysis, and, at the far end of the continuum,
performing the cognitive functions normally associated with vision. Digital image processing, as
the term is used here, encompasses processes whose inputs and outputs are images and, in addition,
processes that extract attributes from images, up to and including the recognition of
individual objects. As a simple illustration to clarify these concepts, consider the area of
automated analysis of text. The processes of acquiring an image of the area containing the text,
preprocessing that image, extracting (segmenting) the individual characters, describing the
characters in a form suitable for computer processing, and recognizing those individual
characters are in the scope of what we call digital image processing.
We will use two principal ways to represent digital images. Assume that an image f(x, y) is
sampled so that the resulting digital image has M rows and N columns. The values of the
coordinates (x, y) now become discrete quantities. For notational clarity and convenience, we
shall use integer values for these discrete coordinates. Thus, the values of the coordinates at the
origin are (x, y) = (0, 0). The next coordinate values along the first row of the image are
represented as (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to
signify the second sample along the first row. It does not mean that these are the actual values of
physical coordinates when the image was sampled. Figure 1 shows the coordinate convention
used.
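The notation introduced in the preceding paragraph allows us to write the complete M × N digital
image in the following compact matrix form:

$$
f(x, y) = \begin{bmatrix}
f(0, 0) & f(0, 1) & \cdots & f(0, N-1) \\
f(1, 0) & f(1, 1) & \cdots & f(1, N-1) \\
\vdots & \vdots & \ddots & \vdots \\
f(M-1, 0) & f(M-1, 1) & \cdots & f(M-1, N-1)
\end{bmatrix}
$$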
The right side of this equation is by definition a digital image. Each element of this matrix array
is called an image element, picture element, pixel, or pel.
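As a minimal sketch of this representation, the Python snippet below stores a small image as a list of M rows with N samples each, so that f[x][y] follows the row and column convention described above. The pixel values and the variable names are made up for illustration.

```python
# A tiny 3 x 4 grayscale image stored row by row (values are illustrative only).
f = [
    [12,  40,  40, 12],
    [40, 200, 200, 40],
    [12,  40,  40, 12],
]

M = len(f)      # number of rows
N = len(f[0])   # number of columns

origin = f[0][0]          # sample at (x, y) = (0, 0)
second_in_row = f[0][1]   # sample at (x, y) = (0, 1): second sample along the first row
last = f[M - 1][N - 1]    # sample at (x, y) = (M - 1, N - 1)

# Average intensity: an operation whose output is a single number, not an image.
average = sum(sum(row) for row in f) / (M * N)
```

The indices are array positions only; they say nothing about the physical coordinates at which the scene was sampled.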
Image acquisition is the first process shown in Fig. 2. Note that acquisition could be as simple as
being given an image that is already in digital form. Generally, the image acquisition stage
involves preprocessing, such as scaling.
Image enhancement is among the simplest and most appealing areas of digital image processing.
Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or
simply to highlight certain features of interest in an image. A familiar example of enhancement is
when we increase the contrast of an image because “it looks better.” It is important to keep in
mind that enhancement is a very subjective area of image processing.
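One simple way to increase contrast is linear contrast stretching, which maps the darkest pixel to 0 and the brightest to the maximum output level. The text does not name a specific technique, so this choice, the function name `stretch_contrast`, and the sample values are assumptions made for illustration.

```python
def stretch_contrast(image, out_max=255):
    """Linearly rescale pixel values so they span 0..out_max."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in image]
    scale = out_max / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]


# Hypothetical low-contrast image: all values crowded between 100 and 140.
low_contrast = [
    [100, 110, 120],
    [110, 120, 130],
    [120, 130, 140],
]
enhanced = stretch_contrast(low_contrast)
```

After stretching, the values occupy the full 0 to 255 range. Whether the result "looks better" remains a subjective judgment.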
Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective, in the sense
that restoration techniques tend to be based on mathematical or probabilistic models of image
degradation. Enhancement, on the other hand, is based on human subjective preferences
regarding what constitutes a “good” enhancement result.
Color image processing is an area that has been gaining in importance because of the significant
increase in the use of digital images over the Internet.
Wavelets are the foundation for representing images in various degrees of resolution.
Compression, as the name implies, deals with techniques for reducing the storage required to
save an image, or the bandwidth required to transmit it. Although storage technology has
improved significantly over the past decade, the same cannot be said for transmission capacity.
This is true particularly in uses of the Internet, which are characterized by significant pictorial
content. Image compression is familiar (perhaps inadvertently) to most users of computers in the
form of image file extensions, such as the jpg file extension used in the JPEG (Joint
Photographic Experts Group) image compression standard.
Morphological processing deals with tools for extracting image components that are useful in the
representation and description of shape.
Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged
segmentation procedure brings the process a long way toward successful solution of imaging
problems that require objects to be identified individually. On the other hand, weak or erratic
segmentation algorithms almost always guarantee eventual failure. In general, the more accurate
the segmentation, the more likely recognition is to succeed.
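To make the idea of partitioning concrete, the sketch below uses global thresholding, one of the simplest segmentation methods. The text does not prescribe a method, so the technique, the function name `threshold_segment`, the threshold of 128, and the pixel values are all assumptions for illustration.

```python
def threshold_segment(image, threshold):
    """Label each pixel 1 (object) if it exceeds the threshold, else 0 (background)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Hypothetical image: a bright 2 x 2 object on a dark background.
image = [
    [10,  12,  11, 10],
    [12, 200, 210, 11],
    [10, 205, 198, 12],
    [11,  10,  12, 10],
]
mask = threshold_segment(image, threshold=128)
object_area = sum(sum(row) for row in mask)   # number of pixels labeled as object
```

A single fixed threshold works here only because object and background are well separated. Real scenes rarely are, which is one reason autonomous segmentation is difficult.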
Representation and description almost always follow the output of a segmentation stage, which
usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels
separating one image region from another) or all the points in the region itself. In either case,
converting the data to a form suitable for computer processing is necessary. The first decision
that must be made is whether the data should be represented as a boundary or as a complete
region. Boundary representation is appropriate when the focus is on external shape
characteristics, such as corners and inflections. Regional representation is appropriate when the
focus is on internal properties, such as texture or skeletal shape. In some applications, these
representations complement each other. Choosing a representation is only part of the solution for
transforming raw data into a form suitable for subsequent computer processing. A method must
also be specified for describing the data so that features of interest are highlighted. Description,
also called feature selection, deals with extracting attributes that result in some quantitative
information of interest or are basic for differentiating one class of objects from another.
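The difference between the two representations can be sketched on a binary mask such as a segmentation stage might produce. The helper names, the use of 4-connectivity to decide which pixels lie on the boundary, and the two descriptors computed at the end are illustrative assumptions, not definitions taken from the text.

```python
def region_pixels(mask):
    """Regional representation: every (row, col) that belongs to the object."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def boundary_pixels(mask):
    """Boundary representation: object pixels with a 4-neighbor outside the object."""
    rows, cols = len(mask), len(mask[0])

    def is_object(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] == 1

    boundary = []
    for r, c in region_pixels(mask):
        neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        if not all(is_object(nr, nc) for nr, nc in neighbors):
            boundary.append((r, c))
    return boundary


# Hypothetical segmented object: a 3 x 3 square of ones.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
region = region_pixels(mask)
boundary = boundary_pixels(mask)

# Two simple descriptors derived from the representations.
area = len(region)                # from the regional representation
boundary_length = len(boundary)   # from the boundary representation
```

For this square, every object pixel except the center one lies on the boundary. Descriptors such as area and boundary length are the kind of quantitative attributes that a later recognition stage could use.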
Recognition is the process that assigns a label (e.g., “vehicle”) to an object based on its
descriptors. We conclude our coverage of digital image processing with the development of
methods for recognition of individual objects.
As recently as the mid-1980s, numerous models of image processing systems being sold
throughout the world were rather substantial peripheral devices that attached to equally
substantial host computers. Late in the 1980s and early in the 1990s, the market shifted to image
processing hardware in the form of single boards designed to be compatible with industry
standard buses and to fit into engineering workstation cabinets and personal computers. In
addition to lowering costs, this market shift also served as a catalyst for a significant number of
new companies whose specialty is the development of software written specifically for image
processing.
Although large-scale image processing systems still are being sold for massive
imaging applications, such as processing of satellite images, the trend continues toward
miniaturizing and blending of general-purpose small computers with specialized image
processing hardware. Figure 3 shows the basic components comprising a typical general-purpose
system used for digital image processing. The function of each component is discussed in the
following paragraphs, starting with image sensing.
With reference to sensing, two elements are required to acquire digital images. The first is a
physical device that is sensitive to the energy radiated by the object we wish to image. The
second, called a digitizer, is a device for converting the output of the physical sensing device into
digital form. For instance, in a digital video camera, the sensors produce an electrical output
proportional to light intensity. The digitizer converts these outputs to digital data.
Specialized image processing hardware usually consists of the digitizer just mentioned, plus
hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which
performs arithmetic and logical operations in parallel on entire images. One example of how an
ALU is used is in averaging images as quickly as they are digitized, for the purpose of noise
reduction. This type of hardware sometimes is called a front-end subsystem, and its most
distinguishing characteristic is speed. In other words, this unit performs functions that require
fast data throughputs (e.g., digitizing and averaging video images at 30 frames/s) that the typical
main computer cannot handle.
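The noise-reduction effect of averaging can be sketched in software. The example below assumes zero-mean Gaussian noise added independently to each frame of a static scene; the noise model, the frame size, the seed, and the function names are assumptions for illustration, and dedicated hardware would perform the same arithmetic at video rate.

```python
import random


def average_frames(frames):
    """Pixel-wise mean of a list of equally sized frames."""
    k = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / k for c in range(cols)]
            for r in range(rows)]


def mean_abs_error(image, reference):
    """Average absolute difference between two equally sized images."""
    diffs = [abs(p - q)
             for row, ref in zip(image, reference)
             for p, q in zip(row, ref)]
    return sum(diffs) / len(diffs)


random.seed(0)                              # fixed seed so runs are repeatable
clean = [[100.0] * 8 for _ in range(8)]     # hypothetical noise-free static scene
frames = [[[p + random.gauss(0, 20) for p in row] for row in clean]
          for _ in range(30)]               # 30 noisy frames, i.e. one second at 30 frames/s

averaged = average_frames(frames)
error_single = mean_abs_error(frames[0], clean)
error_averaged = mean_abs_error(averaged, clean)
```

Under this noise model the averaged image is much closer to the clean scene than any single frame, because independent noise partly cancels while the underlying scene does not.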
The computer in an image processing system is a general-purpose computer and can range from
a PC to a supercomputer. In dedicated applications, sometimes specially designed computers are
used to achieve a required level of performance, but our interest here is on general-purpose
image processing systems. In these systems, almost any well-equipped PC-type machine is
suitable for offline image processing tasks.
Software for image processing consists of specialized modules that perform specific tasks. A
well-designed package also includes the capability for the user to write code that, as a minimum,
utilizes the specialized modules. More sophisticated software packages allow the integration of
those modules and general-purpose software commands from at least one computer language.
Mass storage capability is a must in image processing applications. An image of size 1024 × 1024
pixels, in which the intensity of each pixel is an 8-bit quantity, requires one megabyte of storage
space if the image is not compressed. When dealing with thousands, or even millions, of images,
providing adequate storage in an image processing system can be a challenge. Digital storage for
image processing applications falls into three principal categories: (1) short-term storage for use
during processing, (2) on-line storage for relatively fast recall, and (3) archival storage,
characterized by infrequent access. Storage is measured in bytes (eight bits), Kbytes (one
thousand bytes), Mbytes (one million bytes), Gbytes (meaning giga, or one billion, bytes), and
Tbytes (meaning tera, or one trillion, bytes). One method of providing short-term storage is
computer memory. Another is by specialized boards, called frame buffers, that store one or more
images and can be accessed rapidly, usually at video rates (e.g., at 30 complete images per
second). The latter method allows virtually instantaneous image zoom, as well as scroll (vertical
shifts) and pan (horizontal shifts). Frame buffers usually are housed in the specialized image
processing hardware unit shown in Fig. 3. On-line storage generally takes the form of magnetic
disks or optical-media storage. The key factor characterizing on-line storage is frequent access to
the stored data. Finally, archival storage is characterized by massive storage requirements but
infrequent need for access. Magnetic tapes and optical disks housed in “jukeboxes” are the usual
media for archival applications.
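The storage figures quoted above can be checked with a short calculation. The function names are hypothetical. Note that 1024 × 1024 bytes is 1,048,576 bytes, which is slightly more than one Mbyte in the decimal units defined above (it equals one megabyte only if a megabyte is taken as 2^20 bytes).

```python
def image_bytes(rows, cols, bits_per_pixel=8):
    """Uncompressed storage, in bytes, for one image (rounded up to whole bytes)."""
    return (rows * cols * bits_per_pixel + 7) // 8


def human(n_bytes):
    """Format a byte count using the decimal units defined in the text."""
    units = (("Tbytes", 10**12), ("Gbytes", 10**9),
             ("Mbytes", 10**6), ("Kbytes", 10**3))
    for unit, size in units:
        if n_bytes >= size:
            return f"{n_bytes / size:.2f} {unit}"
    return f"{n_bytes} bytes"


one_image = image_bytes(1024, 1024)    # one 8-bit 1024 x 1024 image
thousand = 1_000 * one_image           # a collection of one thousand such images
million = 1_000_000 * one_image        # a collection of one million such images
```

Calling `human(thousand)` and `human(million)` shows how a thousand such images already reach the Gbyte range and a million reach the Tbyte range, which is why archival capacity becomes a design concern.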
Image displays in use today are mainly color (preferably flat screen) TV monitors. Monitors are
driven by the outputs of image and graphics display cards that are an integral part of the
computer system. Seldom are there requirements for image display applications that cannot be
met by display cards available commercially as part of the computer system. In some cases, it is
necessary to have stereo displays, and these are implemented in the form of headgear containing
two small displays embedded in goggles worn by the user.
Hardcopy devices for recording images include laser printers, film cameras, heat-sensitive
devices, inkjet units, and digital units, such as optical and CD-ROM disks. Film provides the
highest possible resolution, but paper is the obvious medium of choice for written material. For
presentations, images are displayed on film transparencies or in a digital medium if image
projection equipment is used. The latter approach is gaining acceptance as the standard for image
presentations.
Networking is almost a default function in any computer system in use today. Because of the
large amount of data inherent in image processing applications, the key consideration in image
transmission is bandwidth. In dedicated networks, this typically is not a problem, but
communications with remote sites via the Internet are not always as efficient. Fortunately, this
situation is improving quickly as a result of optical fiber and other broadband technologies.
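The role of bandwidth can be illustrated with a rough transfer-time estimate for one uncompressed 1024 × 1024, 8-bit image. The two link rates below are hypothetical round numbers chosen for illustration, and the calculation ignores protocol overhead and latency.

```python
def transfer_seconds(n_bytes, link_bits_per_second):
    """Ideal transfer time, ignoring protocol overhead and latency."""
    return n_bytes * 8 / link_bits_per_second


image_size = 1024 * 1024        # bytes in one uncompressed 8-bit image
slow_link = 10_000_000          # hypothetical 10 Mbit/s connection
fast_link = 1_000_000_000       # hypothetical 1 Gbit/s connection

t_slow = transfer_seconds(image_size, slow_link)
t_fast = transfer_seconds(image_size, fast_link)
```

Since transfer time scales inversely with link rate, the same image takes one hundred times longer on the slower link. Compression reduces the time further by shrinking `image_size`.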
Although the digital image processing field is built on a foundation of mathematical and
probabilistic formulations, human intuition and analysis play a central role in the choice of one
technique versus another, and this choice often is made based on subjective, visual judgments.
Figure 4.1 shows a simplified horizontal cross section of the human eye. The eye is nearly a
sphere, with an average diameter of approximately 20 mm. Three membranes enclose the eye:
the cornea and sclera outer cover; the choroid; and the retina. The cornea is a tough, transparent
tissue that covers the anterior surface of the eye. Continuous with the cornea, the sclera is an
opaque membrane that encloses the remainder of the optic globe. The choroid lies directly below
the sclera. This membrane contains a network of blood vessels that serve as the major source of
nutrition to the eye. Even superficial injury to the choroid, often not deemed serious, can lead to
severe eye damage as a result of inflammation that restricts blood flow. The choroid coat is
heavily pigmented and hence helps to reduce the amount of extraneous light entering the eye and
the backscatter within the optical globe. At its anterior extreme, the choroid is divided into the
ciliary body and the iris diaphragm. The latter contracts or expands to control the amount of light
that enters the eye. The central opening of the iris (the pupil) varies in diameter from
approximately 2 to 8 mm. The front of the iris contains the visible pigment of the eye, whereas
the back contains a black pigment.
The lens is made up of concentric layers of fibrous cells and is suspended by fibers that attach to
the ciliary body. It contains 60 to 70% water, about 6% fat, and more protein than any other tissue
in the eye. The lens is colored by a slightly yellow pigmentation that increases with age. In
extreme cases, excessive clouding of the lens, caused by the affliction commonly referred to as
cataracts, can lead to poor color discrimination and loss of clear vision. The lens absorbs
approximately 8% of the visible light spectrum, with relatively higher absorption at shorter
wavelengths. Both infrared and ultraviolet light are absorbed appreciably by proteins within the
lens structure and, in excessive amounts, can damage the eye.
The innermost membrane of the eye is the retina, which lines the inside of the wall’s entire
posterior portion. When the eye is properly focused, light from an object outside the eye is
imaged on the retina. Pattern vision is afforded by the distribution of discrete light receptors over
the surface of the retina. There are two classes of receptors: cones and rods. The cones in each