Position, Rotation, and SC Ale-Invariant Recognition OF 2-Dimensional Objects Using A Gradient Coding Scheme

Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

EEE Pacific Rim Conferenceon Communications, Computers and Signal Processing

June 1st - 2nd. 1989


POSITION, ROTATION, AND SCALE-INVARIANT RECOGNITION OF
2-DIMENSIONAL OBJECTS USING A GRADIENT CODING SCHEME

R. Safaee-Rad*, B. Benhabib*, K.C. Smith*+, K.M. Ty'


Computer Integrated Manufacturing Laboratory"
*Department of Mechanical Engineering
+Department of ElectricalEngineering
University of Toronto
Toronto, Ontario, M5S 1A4

Abstract

The 2D representation and recognition of objects, The problem of 2D shape recognition is one of the
based on external space-domain descriptors, is addressed most familiar and fundamental problems in pattern recogni-
in this paper. A new shape signature is proposed for 2D tion. Comprehensive surveys of algorithms for description
shape recognition, namely the gradient-perimeter plot. The and analysis of 2D shapes have been reported in [4,5]. Ac-
recognition method is , position-, rotation-, and scale- cording to these surveys, 2D shape descriptors can be di-
invariance. For the matching process a cost function, based vided into four classes, two of which are based on the ex-
on mean square error (MSE), is minimized. It is also shown ternal (boundary) properties of the shape (external scalar-
that, as the basis of an alternative method, the angle- transform descriptors and external space-domain descrip-
perimeter plot can be used to provide a feature set for 2D tors). Boundary is the basis of these two 2D shape analysis
shapes. This method can be applied to polygonal shapes, techniques, since it has been shown that the information
or shapes for which polygonal approximations are content of a shape is concentrated along the boundary
available. Both methods were successfully applied to a set contours, and furthermore, at points (on those contours) at
of standard shapes. which the direction changes most rapidly [6].
One of the 2D representation and recognition tech-
1. Introduction niques under study for integration in the proposed vision
system, is based on the boundaries of shapes (an external
The flexibility of robotic workcells can be significantly space-domai n descriptor). Boundary representation
increased through the integration of visual sensors for 3D schemes generally express the curvature in some manner
object recognition; thus, reducing the need for customized as a function of linear or angular variation along a blob's
and often complex tooling by implementing flexible au- periphery. These functions can be referred to as signatures
tomation rather than hard automation. The research pro- of planar shapes. In general, a signature is a function that
gram currently undetway at the University of Toronto fo- can be associated with any continuous planar curve, and its
cuses on developing an efficient and flexible 3D object- definition combines local variations in the curve's tangent
recognition method for vision-based robotic assembly. The with the global layout of the length on the plane.
main incentives in pursuing this research are the lack of
generalized computer vision theories applicable to robotic Various 2D shape signatures (with different names)
vision, and the inefficiency and inflexibility of existing 3D have been proposed and used for pattern analysis in the
object-recognition methods for industrial environments. machi ne-vision literature, such as: " K-curvature" [7],
Volumetric methods based on the exact specification "incremental curvature" (based on a line-segment scan
of an object, and multi-view feature representation based method) [a], "angle versus length signature" [9], "polar cod-
either on the characteristic views of an object or on the dis- ing of an object" [IO]. "plane curve signature" [ll], "the cur-
crete view-sphere representation have been developed vature K(s) of a contour [12], "normal contour distance
[1,2]. However, these techniques are complex and compu- (NCD) signature" (131, and, "slope density function" [14]. In
tationally expensive due to their need for a 3D matching this paper, a new boundary based 2D shape descriptor
process. The new 3D object-recognition technique under (signature) is defined (under the constraints of position-,
development is based on a pre-marking scheme, through rotation-, and scale-invariance) and applied to a set of
which the matching process is performed in 2 0 space [3]. standard 2D shapes. The boundary is traced and chain-
coded using the Freeman chain code [15].
In the proposed 3 0 object-recognition technique, an
object is modeled using only a small number of 2D distinct 2. Gradient-Perimeter Function
perspective views. The number of views depends partly on
the degree of symmetry of the object. These are referred to A number of preprocessing steps must be carried out
as standard-views, each with a corresponding standard- before the boundary of an object silhouette can be chain-
view-axis. For successful recognition, the input image of an coded. These include the thresholding of a "good" con-
'

object must be one of its standard perspective views. Thus, trasted image, followed by its edge detection using, for
a mobile camera is used, such that the camera's optical example, the Sobel operator. The starting point of the edge
axis can be aligned with one of the standard-view-axes of is recorded and the edge chain coded. The gradient map
the object in order to acquire a standard-view. The used for chain coding is shown in Figure 1. Since the
matching process is, then, performed between the acquired Freeman gradient map can cause a sudden jump from 0 up
2D standard-view of the object under consideration and the to 7, or 7 down to 0, when the gradient is plotted against the
library of 2D standard-views of a set of objects. Based on distance along the contour, forming a gradient-perimeter
the proposed method, then, any 2D representation and its plot, a modified code is used. The next vector code, Ci+l, in
corresponding recognition technique that is used, must be the modified chain code is defined as,
Dosition-. rotation-, and scale-invariance.
CH2691-4/89-0000.306 $1.00 0 1989 IEEE
Ci+,=Ci - (Ui - U&, + 8K) for i > 0

where, K = 0 if Ui+, < octmod (Ui + 3)


,, > octmod (Ui + 3)
K = 1 if U where 1 Ci I is the link length of the gradient code Ci (which
can be either 1 or square root of 2), and 0 is the angle mea-
Ci = Ui for i=l.
sured from the positive x-axis counterclockwise.
Then, if P is an odd number, and M = (P-l)/2, the chain
Ui is the original Freeman code. The function octmod (S) is Vectors Ci-M Ci-M+I,..., ci, ..., C~+M
1 would be examined. The
the absolute value of S, modulo 8. Applying equation (1) to total projection on the x-axis, Xproj, is the sum of the P
the Freeman code sequence "2, 1, 0, 7, 0, 6", for example, individual projections on the x-axis,
would yield the modified code sequence "2, 1, 0, -1, 0, -2".
Thus, the difference between two consecutive chain vectors
is always less than 4. (4)

A similar expression can be defined for Yproj,

The direction of the line can be determined from the angle


Bproj formed by the resultant "sum" vector and the positive x-
axis, where eproj is equal to tan-'(Yproj /Xproj). The vector
code of the current link after averaging, Ci, is then given by,

- 'proj
C i = 8 - 7 +8n.
-
Figure 1. The gradient map. 4

Using this modified gradient coding scheme, if the The value of n (..., -2, -1, 0, 1, 2, ...) is chosen such that the
boundary of the object silhouette is traced from left to right
and top to bottom, a convex boundary would have a non- absolute value of (Ei-Ci) would be less than 4. It should be
increasing gradient-perimeter function. A polygonal noted that, in general, Ci does not have an integer value.
approximation of a boundary with none of its angles greater
than 180 degrees is defined as convex boundary, whereas With an edge tracing algorithm, the modified edge-
a boundary with any of its angles greater than 180 degrees gradient-coding method, and the above gradient-code-
is defined as a (partially) concave boundary. smoothing process, the function of the edge gradient versus
Since only 8 discrete directions can be obtained from the distance along the contour, namely the gradient-
the boundary of the original object silhouette, the resolution perimeter function, can be plotted. The distance is defined
of the direction vector may be seen to be too low for a good in terms of the number of links and link lengths, (either 1 or
estimate of the slope at a point on the boundary of the square root of 2), in the Freeman chain code, and
object silhouette To increase the resolution, one can measured from the starting point of the chain to the point of
increase the number of direction vectors in the map of interest.
Figure 1, but this requires the examination of more than the In particular, provided that the edge of the object's
nearest 8 grid neighbors of the current point on the silhouette is traced from left to right, and from top to bottom,
boundary. Moreover, the complexity of the algorithm would the gradient-perimeter plot of a circle is a straight line with a
increase greatly if more neighboring pixels were examined. constant negative slope. That for r-sided convex polygon is
In order to retain low complexity and improve the res- a descending stare case having r+l horizontal segments
olution, an averaging process, which uses a set of P separated by sudden descending jumps. If the boundary
adjacent elements of the contour chain, is proposed . tracing starts from one of the vertices, there are only r
Based on the experiments carried out on a set of standard horizontal segments.
2D shapes, P was selected to be 15. The averaging Figure 2 and Figure 3 show the gradient-perimeter
procedure for the gradient vectors is important, since it plots of a rectangle and a triangle, respectively. In these
eliminates the discrete jumps in the values of the gradient figures, it can be seen that the plots have 4 and 3
vectors and provides a better estimation of the true descending jumps corresponding to 4 and 3 straight edges
gradient. The criterion for choosing the value of P was of these two shapes, respectively.
derived from the view that: "P should not be too large, so
that important shape information such as corners would be 3. Matching of Gradient-Perimeter Functions
smoothed out; however, it should also not be too small, so
that it causes poor estimation of the gradient ". The.gradient-perimeter function described in section 2
The proposed averaging process is as follows: Let Ci is an effective descriptor for shape recognition. A matching
be the gradient code, (direction code of which there are 8 ) , process based on the gradient-perimeter function is as
follows: Given the gradient-perimeter functions of an
of the current link, and, Cr and Cy represent the lengths of unknown shape and a reference shape , the first step in the
its projection on the x- and y-axes respectively, then, scale-invariant recognition process would be to normalize
the two perimeter lengths, such that, they have the same
307
invariant aspect of the algorithm. If a shape is rotated, and
the gradient-perimeter function starts at the same point on
the contour, there is a d.c. bias between the original shape
and the rotated shape gradient-perimeter functions. Thus,
the d.c. bias removal is important in the rotation-invariant
recognition process.

Minimization of the cost function provides the value for


the parameter q. This value contains the information about
the mapping of the starting point of the gradient-perimeter
function of the unknown shape to the starting point of the
reference gradient-perimeterfunction. As a result, the angle
of rotation of the unknown relative to the reference shapes
can be determined. Once the minimum and maximum
values of the cost function have been determined for the
unknown shape and a reference shape, this procedure
must be repeated for the next and all the remaining
reference shapes. The reference shape that yields the
minimum cost function is considered as a possible match.
I I I I I
0 200 400 600 800 In general, a good match is achieved if the total error
power is approximately equal to the d.c. power. The
maximum value of MSE, on the other hand, can indicate the
Figure 2. The gradient-perimeterplot of a rectangle. degree of geometrical orthogonality of the reference and
unknown shapes.
The relative cost function for two similar rectangles is
plotted in Figure 4. The two local minimum values are
achieved at q=83 and q=332, when the normalized
perimeter length is N=500 links. These values suggest that
the shape has an axis of symmetry.

4. Angle Measurement Using the Chain Code

It has been shown that most of the information


concerning the shape of a contour is carried by the
curvature extrema. One reason for this phenomenon is that
these extrema divide the outline into its major parts, through
which the identity of the shape can be determined [6,16,17].
This is the main incentive for the detection of the corners of
a 2 0 shape (based on its polygonal approximation), and
the estimation of its angles. With the modified chain code
proposed in this paper, it is possible to detect the corners,
to determine the relationships between them, and estimate
their angles. Based on these local information, it is possible
I
I I I I I to develop an alternative method for shape recognition.
0 200 400 600 800

Figure 3. The gradient-perimeter plot of a triangle.


(total) number of links, N. The next step is to compare
mutually shifted versions of the gradient-perimeterfunctions
of the unknown and a reference shapes. The minimum
mean square error (MSE) [la] between the two mutually
shifted gradient-perimeter functions is used to decide
whether the unknown shape and the reference shape
match.
Let f(p) and g(p) be the two normalized gradient-
perimeter functions, and the following be the cost function,
Tq9

[v
2
[ f(P) - g(p+q)I2
Tq = E
P-0
LI
f(p) - g(p+q)l ,-~

I
where q is the amount of shift along the perimeter. Note I I I I I I
that, both f(p) and g(p+q) are periodic functions with a 0 100 200 300 400 500
period N, due to being the gradient-perimeter functions of
closed contours. The first term (MSE) of the cost function Figure 4. The relative cost function (MSE term) for two
can be considered as the total error power, and the second similar rectangles.
term as the total d.c. power, which accounts for the rotation-
308
Let cinand coutbe the average directions of the in-
coming and outgoing chain vectors, respectively, at the
current point i on the perimeter. The direction -din is a
function of {Ci-K, Ci-K+1, ..., Ci-l}, and cout
is a function of
{Ci+l, Ci+2, ..., C~+K).
The parameter K specifies the start and
the end points of the averaging window for the incoming
and outgoing vectors. Based on the results of experiments
carried out on a set of standard 2D shapes, K was selected
to be 9. Applying the method described in section 2, the
average incoming and outgoing angles, (@in,eout), can be
estimated, where the angle at the current point i can then
be defined as,

A graph of the angles measured along a contour I I


versus the distance along a contour, called an angle- 0 500
perimeter plot, is a useful presentation. Figure 5 shows an
angle-perimeter plot. A vertex exists where either a local Figure 6. The angle-perimeter plot of a triangle.
minimum or a local maximum occurs in the angle-perimeter
plot. Note that, the angle-perimeter plot in Figure 5 has four
local minima, and consequently corresponds to a convex Since the position of a vertex can be computed from
shape. Since each of the minima is approximately 90 the coordinates of the starting point and the sequence of
degrees, it can be concluded that the 2D shape (which is the chain vectors, the position of each vertex can be ex-
that of a rectangle) has four vertices with four approximately pressed in x- and y-coordinates. From this data, the
9O-degree angles. distance between any two vertices can be computed.
Figure 6 shows the angle-perimeter plot for a triangle. Based on this information, the relationships between the
The plot has 3 minima corresponding to three angles, all vertices and between lengths of the polygon sides can be
less than 180 degrees, the sum of which is approximately determined.
equal to 165 degrees. The 15 degrees error, introduced by
the application of the proposed method of angle measure- 5. Results and Discussion
ment, is less than 10%.
The fluctuations seem to occur around the 180-degree For the recognition of an unknown 2D shape, its
value can be considered as noise. To detect a vertex, two gradient-perimeter function must be compared with the
threshold limits were set. A local minimum or a local maxi- gradient-perimeter functions of all the reference shapes.
mum must be smaller than 160 degrees or greater than 200 The minimum difference value between these functions
degrees, respectively, to be considered as a vertex. Thus would yield the matching shape. In addition, the relative
using this method, a polygon can be described in terms of angle of rotation can also be determined using the second
the total number of vertices, with the number of angles term of the cost function, once the shape is recognized.
smaller than 180 degrees indicating local convexity, and This method was applied to a set of 2D standard
with the number of angles greater than 180 degrees shapes, where the results are given in Table 1. In this table,
indicating local concavity. the "reference shapes" refer to known shapes, each with a
specific orientation; whereas, the. "test shapes" refer to
unknown shapes, with unknown orientations. The minimum
I
MSE and maximum MSE values are evaluated by shifting
the normalized gradient-perimeter function for the test
shape against the reference shape. These two values, as
well as their ratio are given in Table 1.
The first 9 entries in Table 1 correspond to cases
where the reference and test shapes are the same. The re-
maining entries have different reference and test shapes.
150- "Ellipse. n" has the ratio (major-axis-length/mi nor-axis-
length) equal to n. In general, minimum MSE values would
be close to zero only when the test and the reference
shapes are identical (as expected), or when the two shapes
are quite similar. This is especially true in the case of el-
lipses with n values larger than 4, due to the fact that their
boundaries are not well represented on the rectangular
sampling grid.
100. For a reliable match, the ratio (minimum MSE/
maximum MSE) must also be close to zero. Two exceptions
to this rule are: ellipses with n values larger than 4 , and
identical circles. The reason for obtaining a comparatively
I
0
I
200
I
400
I
600
I
800
large MSE ratio value for identical circles, is due to the
universal symmetry of circles. Based on these results, it can
be concluded that the minimum MSE is a good feature for
Figure 5. The angle-perimeter plot of a rectangle. recognition purposes, especially for well-resolved images.

309
=
Refaence Test Maximum Ratlo of 6. Conclusions
Min. MSE U)
object object MSE MSE Max MSE In this paper, a new signature is proposed for 2D
cam cam 0.0065 0.3704 0.0175 shape recognition, namely the gradient-perimeter plot. The
Circle circle 0.0012 0.0168 0.0731 method is developed under the constraint of being position-
ellipse2 ellipse2 0.0061 0.7983 o.mi6 , rotation-, and scale-invariance. For the matching process
ellipse.4 ellipse.4 0.0142 1.8787 0.0075 a cost function which has the property of rotation-invariance
ellips.8 ellipse.8 0.0088 2.7095 0.0032 is defined. The second term of the proposed cost function
ellipse.20 ellipsea 0.0184 3.2851 0.0056 accounts for the rotation-invariant aspect of the method.
WlYP PolY8On 0.0064 5.0938 0.0013 The scale-invariance of the method is achieved by
rectansle rectangle 0.0022 1.2735 0.0017 normalizing the length of the shape's perimeter length.
triangle triangle 0.0238 1.6463 0.0145
cam circle 0.1070 0.1 140 0.9387 The method has been applied to a set of standard 2D
cam rectangle 0.2002 0.7680 0.2607 shapes. The experimental results were consistent and sat-
cam triangle 0.4738 0.7821 0.6058 isfactory.
Circle ellipse.2 0.1806 0.2077 0.8694 Four sources of error have been identified. The
ehpse.2 eUise.4 0.0784 1.2423 0.0631 effectiveness of the method presented above can be
eUipse.4 eUipse.8 0.0167 2.3260 0.0072 increased, if errors or their effects can reduced.
ellipse.8 circle 0.8756 0.9619 0.9102 As the basis of an alternative method, the angle-
ellip9e.8 ellipse.20 0.0036 3.0309 0.0012 perimeter plot can also be used to provide a feature set for
DOlygOn ellipse.4 1.6471 3.7464 0.43% 2D shapes. This method can be applied to polygonal

-POlYgOn
rectangle
rectangle
triangle
1.4946
0.6848
3.0354
1.0749
0.4924
0.6371
shapes or shapes for which polygonal approximations are
available. The accuracy of the measured features is highly
crucial in the recognition process, in general, and the
above-mentioned error sources have to be addressed in
Table 1. Results of matching algorithm.
this second method as well.

For polygons and general 2D shapes for which polyg- References


onal approximations are generated, another effective
method for shape description would be to create a feature Basel, P.J., and Jane, R.C., "Three-Dimensional Ob-
vector incorporating angles of vertices and lengths of sides. ject Recognition", Computer Surveys, Vol. 17, No. 1,
This information is easily extracted from the chain code as pp. 75-145, March 1985.
described in section 4. The angles, the lengths of the sides, Horn, B.K.P., Robot Vision, MIT Press, 1986.
the relationships between the angles, and the relationships Safaee-Rad, R., Benhabib, B., Smith, K.C., and Zhou,
between the sides of a polygon, provide sufficient Z., "Pre-Marking Methods for 3D Object Recognition",
information on its shape, and thus for its recognition, being IEEE, Int. Conf. on Systems, Man, and Cybernetics,
invariant to position and orientation. Nov. 1989. (Submitted)
As well, scale-invariance can be achieved by Pavlidis, T., "Survey: A Review of Algorithms for
comparing the ratios of the side lengths. The match can be Shape Analysis", Computer Graphics and Image Pro-
obtained by determining the nearest neighbor between the cessing, Vol. 7, pp. 243-258, 1978.
test-shape feature vector and the reference-shape feature Pavlidis, T., "Algorithms for Shape Analysis of Con-
vectors in the feature space (using the nearest neighbor tours and Waveforms", IEEE, Trans. on Pattern Analy-
technique). The matching process of a test shape with a sis and Machine Intelligence, Vol. PAMI-2, No. 4, pp.
reference shape requires the number of vertices of both the 301-312, July 1980.
test and the reference shapes to be equal. Thus, it would be Attneave, F., "Some Informational Aspects of Visual
most logical that the recognition process be made Perception", Psychol. Rev., Vol. 61, pp. 183-193,
hierarchical. 1954.
The basic problem that all boundary-based Geisler, W., "A Vision System for Shape and Position
recognition techniques face, is variability or "noise". As Recognition of Industrial Parts", Proc., 2nd Int. Conf.
noted earlier, there are various levels of errors inherent in on Robot Vision and Sensory Controls, pp. 253-262,
the results obtained: for example, there is an uncertainty of F.R. of Germany, Nov. 1982.
up to 20 degrees in the angle-measurement process; for Freeman, H., "Use of Incremental Curvature for De-
ellipses with axes ratios larger than 4, the process leads to scribing and Analyzing Two-Dimensional Shape",
minimum MSE values which are lower than expected, IEEE, Proc., Conf. on Pattern Recognition and Image
leading to probable misclassification. Processing, pp. 437-444, Chicago, Aug. 1979.
In general, various factors may contribute to the Yang, H.S., and Sengupta, S., "Intelligent Shape
measurement errors of local curvature of the boundary, and Recognition for Complex Industrial Tasks", IEEE.
also to the measurement errors of the distance along the Control Svstems Magazine, pp. 23-30, June 1988.
contour. There are four main factors that need investigation
[IO] Nehr, G.,-and Martifi, P., "The Coupling of a Work-
and improvement in order to increase the accuracy of the piece Recognition System With an Industrial Robot",
measured values: the edge-detection technique and Robot Vision, Alan Pugh, pp. 83-96, 1983.
thresholding method, the edge tracing algorithm [19], the
[Ill O'Rourke, J., "The Signature of a Plane", SIAM, J.
size selection of the coding matrix [20], and, the smoothing Comput.,Vol. 15, No. 1 , pp. 34-51, Feb. 1986.
of the chain-code sequence (with the selection of a rational (121 Dessimoz, J.D., " Visual Identificationand Location in
method for choosing a value for P rather than an empirical a Multi-Object Environment by Contour Tracking and
one - see section 2). Some of these aspects have been Curvature Description", Proc., 8th Inr. Symp. on In-
addressed in the literature, but their relevance to the dustrial Robots, pp.764-777, F.R. of Germany, May-
approach presented here must be further pursued. June 1978.
[13] Vernon, D., "Two-Dimensional Object Recognition
Using Partial Contours", Image and Vision Comput-

310
ing, Vol. 5,No. 1, pp. 21-27,Feb. 1987.
[14] Ballard, D.H., and Brown C.M., Computer Vision,
Prentice-Hall, 1982.
[15] Freeman, H., " Computer Processing of Line-Drawing
Images", Computing Surveys, Vol. 6,No.1, pp. 57-97,
March 1974.
[l 61 Biederman, I., "Human Image Understanding: Recent
Research and a Theory", Comput. Vis. Graphics Im-
age Process., Vol. 32,pp. 29-73,1985.
[17] Resnikoff, H., The Illusion of Reality: Topics in
lnformation Science, Springer-Verlag, New York,
1985.
[la] Oppenheim, A., and Schafer, R., Digital Signal
Processing, Prentice-Hall, Englewood Cliffs, New
Jersey, 1975.
[19]Bennett, J.R., and Mac Donald, J.S., "On the Mea-
surement of Curvature in a Quantized Environment",
IEEE, Trans. on Computers, Vol. C-24, No. 8,pp. 803-
820,Aug. 1975.
[20]Freeman, H., and Saghri, A., "Generalized Chain
Codes for Planar Curves", Proc., 4th International Joint
Conference on Pattern Recognition, Kyoto, Japan, pp.
701 -703,NOV.1978.

Acknowledgement

This research was supported in part by the


Manufacturing Research Corporation of Ontario (MRCO).

311

You might also like