DIP Basics01
In Computer Graphics and Image Processing, the video refresh order (row by row, from top-left to bottom-right) is usually followed, with the origin at the top left of the screen, +x pointing to the right and +y pointing downwards! Matlab has an even stranger convention for image arrays, storing them in (y, x) instead of (x, y) order. So, for an entry (i, j) in Matlab, i indexes the y-direction (the row) and j the x-direction (the column).
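As a small illustration, a NumPy sketch of this (y, x) convention; NumPy, like Matlab, stores images in (row, column) order, and the tiny array is made up for the example:

    import numpy as np

    # A tiny 3 x 4 "image": 3 rows (y direction), 4 columns (x direction).
    img = np.arange(12).reshape(3, 4)

    y, x = 2, 1            # pixel in row y = 2, column x = 1
    print(img.shape)       # (3, 4) -> (number of rows, number of columns)
    print(img[y, x])       # indexing is (row, column) = (y, x), not (x, y); prints 9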
The 2D neighbourhoods of pixel values and of filter coefficients are serialized row-wise into:
a row vector (I(i − 1, j − 1), I(i − 1, j), . . . , I(i, j), . . . , I(i + 1, j), I(i + 1, j + 1)) of pixel values; and
a vector of the weight coefficients, serialized in the same order.
Their inner product Iout is written to the corresponding middle position (i, j) of the output image. Of course, different serializations are possible here; the only important thing is that it is done the same way for the local pixel values and the position-dependent weight factors.
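A minimal NumPy sketch of this inner-product formulation; the border pixels are simply skipped here, and the function name filter3x3 and the averaging weights in the usage line are illustrative choices, not part of the notes:

    import numpy as np

    def filter3x3(I, W):
        """Write the inner product of each serialized 3x3 neighbourhood of I and
        the identically serialized weights W to the centre position of the output.
        (Note: without flipping W this is correlation rather than convolution.)"""
        out = np.zeros(I.shape, dtype=float)
        w = W.ravel()                              # row-wise serialization of the weights
        for i in range(1, I.shape[0] - 1):         # border pixels are left untouched
            for j in range(1, I.shape[1] - 1):
                nbh = I[i-1:i+2, j-1:j+2].ravel()  # row-wise serialized neighbourhood
                out[i, j] = nbh @ w                # inner product -> output pixel (i, j)
        return out

    # Example use: 3 x 3 averaging filter.
    # smoothed = filter3x3(img.astype(float), np.ones((3, 3)) / 9.0)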
The combination of two Roberts operators creates a second derivative estimator with coefficients (−1 2 −1), applied horizontally or vertically (or diagonally). These 1D weights are combined into 3 × 3 Laplace operators like
\[
\begin{pmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{pmatrix}, \quad
\begin{pmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & -2 & 1 \\ -2 & 4 & -2 \\ 1 & -2 & 1 \end{pmatrix}.
\]
Theory has it that the zero crossings of this second derivative are an optimal segmentation choice.
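As a rough sketch, applying the 4-neighbour Laplace operator and marking sign changes between neighbouring responses as zero crossings; SciPy's ndimage convolution is assumed to be available:

    import numpy as np
    from scipy.ndimage import convolve

    laplace = np.array([[ 0, -1,  0],
                        [-1,  4, -1],
                        [ 0, -1,  0]], dtype=float)

    def zero_crossings(img):
        """Convolve with the Laplace operator and mark pixels where the response
        changes sign between horizontal or vertical neighbours."""
        L = convolve(img.astype(float), laplace)
        zc = np.zeros(img.shape, dtype=bool)
        zc[:, :-1] |= np.signbit(L[:, :-1]) != np.signbit(L[:, 1:])   # horizontal sign change
        zc[:-1, :] |= np.signbit(L[:-1, :]) != np.signbit(L[1:, :])   # vertical sign change
        return zc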
The 2 × 2 and 3 × 3 transformation matrices for 2D and 3D coordinate transformations are also enlarged to 3 × 3 and 4 × 4 homogeneous transformation matrices for 2D and 3D transformations. Elementary homogeneous transformations now look like this in 2D (translation, scaling, rotation):
\[
\begin{pmatrix} 1 & 0 & d_x \\ 0 & 1 & d_y \\ 0 & 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
Homogeneous coordinate transformations in 2D are applied as follows (2D input → homogeneous → transformed → normalised → 2D result):
\[
\begin{pmatrix} x \\ y \end{pmatrix} \rightarrow
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \rightarrow
\begin{pmatrix} x' \\ y' \\ h \end{pmatrix} \rightarrow
\begin{pmatrix} x'/h \\ y'/h \\ 1 \end{pmatrix} \rightarrow
\begin{pmatrix} x'/h \\ y'/h \end{pmatrix}
\]
For instance: rotate an image 30° (degrees) around its center (30, 70). To do this we have to concatenate three elementary operations: translate the origin to the center, rotate around this new origin, and translate back to the old origin position. The concatenated transformation matrix C becomes:
\[
C =
\begin{pmatrix} 1 & 0 & 30 \\ 0 & 1 & 70 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos 30^{\circ} & -\sin 30^{\circ} & 0 \\ \sin 30^{\circ} & \cos 30^{\circ} & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & -30 \\ 0 & 1 & -70 \\ 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} \cos 30^{\circ} & -\sin 30^{\circ} & -30\cos 30^{\circ} + 70\sin 30^{\circ} + 30 \\ \sin 30^{\circ} & \cos 30^{\circ} & -30\sin 30^{\circ} - 70\cos 30^{\circ} + 70 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Applied to a pixel position (x, y):
\[
\begin{pmatrix} x \\ y \end{pmatrix} \rightarrow
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \rightarrow
C \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
=
\begin{pmatrix} x\cos 30^{\circ} - y\sin 30^{\circ} - 30\cos 30^{\circ} + 70\sin 30^{\circ} + 30 \\ x\sin 30^{\circ} + y\cos 30^{\circ} - 30\sin 30^{\circ} - 70\cos 30^{\circ} + 70 \\ 1 \end{pmatrix}.
\]
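A short NumPy sketch of this concatenation; the centre (30, 70) and the 30° angle are those of the example above, and applying C to the centre should return the centre itself:

    import numpy as np

    def translation(dx, dy):
        return np.array([[1.0, 0.0, dx],
                         [0.0, 1.0, dy],
                         [0.0, 0.0, 1.0]])

    def rotation(alpha):
        c, s = np.cos(alpha), np.sin(alpha)
        return np.array([[c, -s, 0.0],
                         [s,  c, 0.0],
                         [0.0, 0.0, 1.0]])

    # Concatenate: move the centre to the origin, rotate, move back.
    alpha = np.deg2rad(30)
    C = translation(30, 70) @ rotation(alpha) @ translation(-30, -70)

    # Apply to a pixel position (x, y): embed, transform, normalise.
    x, y = 30, 70                         # the centre should map onto itself
    xp, yp, h = C @ np.array([x, y, 1.0])
    print(xp / h, yp / h)                 # approximately 30.0 70.0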
4 Why Inverse Coordinate Transformations are used with Images
If one were to use a forward coordinate transformation to copy a pixel's intensity value(s) to the new position, it may well happen that, due to rotation and/or scaling effects, not all pixels in the output image area get filled: when input positions are mapped onto non-neighboring output positions, intermediate pixels in the output image might not receive a copy of any input pixel value. Thus, the output image may have small holes.
To prevent this from happening, in Image Processing image coordinate transformations are carried out inversely: each pixel position in the output image is visited; using the inverse coordinate transformation, its input location (not necessarily integer-valued) is computed; since the input image consists of a set of integer grid positions (row and column numbers), this real-valued position may end up either:
outside the input grid, in which case a background value is used; or
within an input grid cell with 4 intensity values at its corners, in which
case a so-called interpolation recipe is used to determine what value will
be used as the output value.
The easiest interpolation recipe is to discard the fractional part of the calculated floating-point input position and use the intensity value at the resulting integer corner position. One can also round the floating-point position and take the nearest corner intensity value, or compute a bilinearly interpolated value from the 4 nearest corner intensity values. The inverse concatenated transformation matrix can be obtained from the forward one by determining its inverse, but it is often most easily constructed from re-ordered inverse elementary transformations, which are simple variants of the forward ones. In 2D:
Inverse translation, inverse rotation, and inverse scaling:
\[
\begin{pmatrix} 1 & 0 & -d_x \\ 0 & 1 & -d_y \\ 0 & 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} \cos(-\alpha) & -\sin(-\alpha) & 0 \\ \sin(-\alpha) & \cos(-\alpha) & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1/s_x & 0 & 0 \\ 0 & 1/s_y & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
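A minimal inverse-mapping sketch in NumPy using the nearest-neighbour recipe; the 3 × 3 forward matrix C is assumed to be given (for instance the concatenated rotation above), and background is an assumed fill value:

    import numpy as np

    def warp_inverse(img, C, background=0):
        """Fill each output pixel by inverse-transforming its position into the
        input image and copying the nearest grid value (nearest-neighbour recipe)."""
        Cinv = np.linalg.inv(C)        # or build it from re-ordered inverse elementary steps
        out = np.full_like(img, background)
        rows, cols = img.shape
        for y_out in range(rows):
            for x_out in range(cols):
                x_in, y_in, h = Cinv @ np.array([x_out, y_out, 1.0])
                x_in, y_in = x_in / h, y_in / h              # normalise
                i, j = int(round(y_in)), int(round(x_in))    # nearest integer grid position
                if 0 <= i < rows and 0 <= j < cols:          # inside the input grid?
                    out[y_out, x_out] = img[i, j]
                # otherwise the background value is kept
        return out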
5 3D perspective transformations
A lucky coincidence of the use of homogeneous coordinates is that not only the extra rightmost column (used for translations) but also the extra bottom row can serve a useful purpose.
When a 4×4 coordinate transformation matrix has non-zero entries in the (4, 1), (4, 2), (4, 3) positions, the matrix acts as a perspective transformation with convergence points in the corresponding directions. A single non-zero value at the (4, 3) position introduces a central perspective transformation along the z-axis (or x3 direction) with a central vanishing point. But (pairs of) vanishing points can also be introduced in the horizontal and vertical directions, for x going towards +∞ or −∞ and/or y going towards +∞ or −∞.
An example of a central perspective projection is:
\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & V & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
=
\begin{pmatrix} x \\ y \\ 0 \\ 1 + Vz \end{pmatrix}
\rightarrow
\begin{pmatrix} x/(1 + Vz) \\ y/(1 + Vz) \\ 0 \\ 1 \end{pmatrix}
\]
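As a sketch, the same central perspective projection in NumPy; the value of V and the example point are made up for illustration:

    import numpy as np

    V = 0.01                                   # assumed perspective parameter
    P = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, V, 1]], dtype=float)

    p = np.array([20.0, 40.0, 100.0, 1.0])     # homogeneous 3D point (x, y, z, 1)
    q = P @ p                                  # -> (x, y, 0, 1 + V*z)
    q /= q[3]                                  # normalise by the homogeneous coordinate
    print(q)                                   # [10. 20.  0.  1.]  i.e. x/(1+Vz), y/(1+Vz), 0, 1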
Under such a transformation, the structure of an object (ordering of vertices, edges and surface triangles) remains the same. CG therefore only has to calculate a coordinate transformation for the list of points. The data matrix D (see Lay, Ch. 2.7) consists of these vertices (one column per vertex).
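As a sketch, transforming all vertices at once by multiplying the data matrix D (one homogeneous column per vertex) with the transformation matrix; the unit-square vertices and the particular transform are made up for illustration:

    import numpy as np

    # Data matrix D: one homogeneous column (x, y, 1) per vertex of a unit square.
    D = np.array([[0, 1, 1, 0],
                  [0, 0, 1, 1],
                  [1, 1, 1, 1]], dtype=float)

    # Any 3x3 homogeneous transform, e.g. scale by 2 then translate by (3, 5).
    C = np.array([[2, 0, 3],
                  [0, 2, 5],
                  [0, 0, 1]], dtype=float)

    D_new = C @ D     # a single matrix product transforms the whole list of vertices;
                      # edges and triangles still refer to the same column indices
    print(D_new)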