Edit Distance for Ordered Vector Sets:
A Case of Study
Juan Ramón Rico-Juan and José M. Iñesta⋆
Departamento de Lenguajes y Sistemas Informáticos
Universidad de Alicante, E-03071 Alicante, Spain
{juanra, inesta}@dlsi.ua.es
Abstract. Digital contours in a binary image can be described as an
ordered vector set. In this paper an extension of the string edit distance is
defined for its computation between a pair of ordered sets of vectors. This
way, the differences between shapes can be computed in terms of editing
costs. In order to achieve efficiency, a dominant point detection algorithm
should be applied, removing redundant data before coding shapes into
vectors. This edit distance can be used in nearest neighbour classification
tasks. The advantages of this method applied to isolated handwritten
character classification are shown, compared to similar methods based
on string or tree representations of the binary image.
Topics: Dominant Points, Pattern Recognition, Structural Pattern
Recognition.
1 Introduction
The description of an object contour in a binary image as a string [1] using
Freeman codes [2] or using a tree representation structure [3,1] is widely used
in pattern recognition. For using these structures in a recognition task, the edit
distance is often used as a measure of the differences between two instances.
Both string edit distances [4] and tree edit distances [5] are used, depending on
the data structures utilised for representing the problem data. In this paper, in
order to obtain a representation of the object contour from a binary image, an
ordered vector set is extracted, and an edit distance measure is defined between
pairs of instances of this representation. This measure is an extension of the
string edit distance, adding two new rules and replacing symbols by vectors.
Freeman chain codes keep very fine details of the shapes since they code
the relations between every pair of adjacent pixels of the contours. To reduce
computation time and to remove irrelevant details, a dominant point
detection algorithm is needed. The goal is to reduce the number of features that represent a
binary image, removing redundant data so that the distance is computed faster,
which keeps the final classification time low while retaining good error rates.
⋆
Work supported by the Spanish CICYT under project TIC2003-08496-CO4 and
Generalitat Valenciana I+D+i under project GV06/166.
D.-Y. Yeung et al. (Eds.): SSPR&SPR 2006, LNCS 4109, pp. 200–207, 2006.
© Springer-Verlag Berlin Heidelberg 2006
The remainder of this paper consists of four sections. In section 2, two different representations of the same binary image are extracted. In section 3, a new
distance based on ordered vector sets is defined. In section 4, the results of experiments in a classification task, applying string and ordered vector set edit distances,
are presented. Finally, in section 5, the conclusions and future work are presented.
2 Feature Extraction from a Binary Image
The goal of the ordered vector set is to describe the contour of an object using
the least possible number of elements. The classical representation of a contour
[Figure: the eight Freeman direction codes, 0 to 7, arranged around a central pixel.]
Fig. 1. Freeman 2D code
Original binary image
..................................
...........................XXXXX..
..........XX..........XXXXXXXXXXX.
.........XXXXX...XXXXXXXXXXXXXXXX.
.........XXXXXXXXXXXXXXXXXXXXXXXX.
.........XXXXXXXXXXXXXXXXXXXXXXXX.
.........XXXXXXXXXXXXXXXXXX.....X.
.........XXXXXXXXXXXX.............
.........XXXXXXXX.................
........XXXXXXX...................
........XXXXXX....................
........XXXXXX....................
........XXXXX.....................
........XXXXX.....................
........XXXXX.....................
........XXXX......................
.......XXXXX......................
.......XXXXX......................
.......XXXXX...............XX.....
.......XXXXX.........XXXXXXXXX....
......XXXXXX.....XXXXXXXXXXXXX....
......XXXXXX...XXXXXXXXXXXXXXX....
......XXXXXX..XXXXXXXXXXXXXXXX....
......XXXXXX..XXXXXXXXXXXXX.......
......XXXXX...XXXXXXXXXX..........
.....XXXXXX.....XXXXX.............
.....XXXXXX.......................
....XXXXXX........................
....XXXXXX........................
...XXXXXXX........................
...XXXXXX.........................
...XXXXXX.........................
...XXXXXX.........................
...XXXXXX.........................
...XXXXX..........................
..XXXXXX..........................
..XXXXXX..........................
.XXXXXX...........................
.XXXXX............................
..XXXX............................
..................................
Contour
Open filter
..................................
...........................XXXXX..
..........XX..........XXXXXXXXXXX.
.........XXXXX...XXXXXXXXXXXXXXXX.
.........XXXXXXXXXXXXXXXXXXXXXXXX.
.........XXXXXXXXXXXXXXXXXXXXXXXX.
.........XXXXXXXXXXXXXXXXXX.....X.
.........XXXXXXXXXXXX.............
.........XXXXXXXX.................
........XXXXXXXX..................
........XXXXXXX...................
........XXXXXX....................
........XXXXX.....................
........XXXXX.....................
........XXXXX.....................
........XXXX......................
.......XXXXX......................
.......XXXXX......................
.......XXXXX...............XX.....
.......XXXXX.........XXXXXXXXX....
......XXXXXXX....XXXXXXXXXXXXX....
......XXXXXXXX.XXXXXXXXXXXXXXX....
......XXXXXXXXXXXXXXXXXXXXXXXX....
......XXXXXXXXXXXXXXXXXXXXX.......
......XXXXXXX.XXXXXXXXXX..........
.....XXXXXXX....XXXXX.............
.....XXXXXX.......................
....XXXXXX........................
....XXXXXX........................
...XXXXXXX........................
...XXXXXX.........................
...XXXXXX.........................
...XXXXXX.........................
...XXXXXX.........................
...XXXXX..........................
..XXXXXX..........................
..XXXXXX..........................
.XXXXXX...........................
.XXXXX............................
..XXXX............................
..................................
Thinning algorithm
....................................
...........................XXXXXX...
..........XXX.........XXXXX......X..
.........X...XX..XXXXX............X.
.........X.....XX.................X.
.........X........................X.
.........X........................X.
.........X..................XXXX..X.
.........X............XXXXXX....XX..
........X.........XXXX..............
........X........X..................
........X.......X...................
........X......X....................
........X.....X.....................
........X.....X.....................
........X.....X.....................
.......X.....X......................
.......X.....X......................
.......X.....X.............XXX......
.......X.....X.......XXXXXX...X.....
......X......X...XXXX..........X....
......X.......XXX..............X....
......X........................X....
......X........................X....
......X.....................XXX.....
.....X...................XXX........
.....X.......XXX......XXX...........
....X.......X...XXXXXX..............
....X......X........................
...X.......X........................
...X.......X........................
...X......X.........................
...X......X.........................
...X......X.........................
...X......X.........................
..X......X..........................
..X......X..........................
.X.......X..........................
.X......X...........................
.X.....X............................
.X.....X............................
..XXXXX.............................
....................................
Start
pixel
Start
pixel
............................
..........................X.
.....................XXXXX..
................XXXXX.......
..........XXXXXX............
.........X..................
........X...................
........X...................
........X...................
.......X....................
.......X....................
.......X....................
.......X....................
.......X....................
.......X....................
.......X....................
......X.....................
......X.....................
......X.....................
......X.............XXXXX...
......XXXXXXX...XXXX........
.....X.......XXX............
.....X......................
.....X......................
.....X......................
....X.......................
....X.......................
...X........................
...X........................
..X.........................
..X.........................
..X.........................
..X.........................
..X.........................
.X..........................
.X..........................
............................
Coded chain (obtained from the contour):
F="44445676665666665
66655554454444322122
21222221223344456656
65665666667665544544
45445545666670001010
00001010100001000100
00001000001223232122
221222212222233"
[Plot: linked dominant points.]
Ordered vector set:
|V|        angle
11.401754  -2.875341
5.000000   3.141593
2.828427   -2.356194
10.198039  -1.768192
4.123106   -1.325818
6.082763   -0.165149
7.280110   0.278300
4.000000   0.000000
9.219544   -2.922924
3.162278   2.819842
7.071068   -2.999696
14.560220  -1.849096
29.832868  1.333948
2.828427   0.785398
16.278821  0.185348
Fig. 2. General scheme. From the binary image, morphological filters are applied to
correct gaps and spurious points. Thus, contour and skeleton are obtained. From the
first, the chain code is obtained and from the second, the ordered vector set is extracted
using a dominant point selection algorithm.
in a binary image links each contour pixel with its neighbour using the codes 0 to 7 (see
Fig. 1), which represent a discrete set of 2D directions. This way, a
string that represents the contour is obtained (Fig. 2 top-right).
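As an illustration of this coding, the following minimal Python sketch turns a path of 8-connected contour pixels into a chain code string. The direction layout (code 0 pointing up, codes increasing clockwise, rows growing downwards) and the function name `chain_code` are assumptions made for the example; the text only fixes that the codes 0 to 7 stand for eight 2D directions.

```python
# Assumed layout (not fixed by the text): code 0 points up and the codes
# increase clockwise; each offset is (row step, column step), rows grow downwards.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
CODE_OF = {offset: code for code, offset in enumerate(OFFSETS)}


def chain_code(pixels):
    """Return the chain code string of a path of 8-connected (row, col) pixels."""
    codes = []
    for (r0, c0), (r1, c1) in zip(pixels, pixels[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in CODE_OF:
            raise ValueError(f"{(r0, c0)} and {(r1, c1)} are not 8-neighbours")
        codes.append(str(CODE_OF[step]))
    return "".join(codes)
```

Every pair of adjacent pixels contributes one symbol, which is why the resulting strings are long compared with a representation built from a few selected points.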
This kind of feature extraction assumes that all linked pixels are of equal
importance. If we select the most representative points of the contour and link
all these points, a compact representation of 2D figures is obtained, with fewer
features than using Freeman codes.
The idea is to select a set of dominant points in a contour [6,7], link those
points following the contour of the figure using 2D vectors, and then use this
ordered vector set to represent the image (Fig. 2 bottom-right).
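The step from linked dominant points to an ordered vector set can be sketched as follows. This is a minimal Python illustration, assuming the dominant points are already available as a list of (x, y) coordinates in contour order; the function name `ordered_vector_set` is hypothetical. Angles are computed with `atan2`, which yields values in (-π, π], the same range as the angles listed in Fig. 2.

```python
import math


def ordered_vector_set(points):
    """Convert linked dominant points [(x, y), ...] into [(norm, angle), ...]."""
    vectors = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        # Each vector is stored as its length and its direction in radians.
        vectors.append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return vectors
```

For instance, the displacements (-5, 0) and (-2, -2) give (5.0, 3.141593) and (2.828427, -2.356194), the kind of (norm, angle) pairs shown in the vector table of Fig. 2.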
In the particular application of handwritten character recognition, it is recommended to apply some filter operations to the original image before extracting
and coding the contours [8], including an opening filter [9] and a thinning algorithm [10], in order to remove noise and redundant information.
3 Ordered Vector Set Edit Distance
The string edit distance definition [4] is based on three edit operations: insertion,
deletion, and substitution. Let $\Sigma$ be the alphabet, $A, B \in \Sigma^*$ two finite strings of
characters, and $\Lambda$ the null character. $A_i$ is the $i$th character of the string $A$;
$A_{i:j}$ is the substring from the $i$th to the $j$th character of $A$, both inclusive.
An edit operation is a pair $(a, b) \in (\Sigma \cup \{\Lambda\})^2$ with $(a, b) \neq (\Lambda, \Lambda)$. So, the basic
edit operations are substitution $a \to b$, insertion $\Lambda \to b$ and deletion $a \to \Lambda$. If a
generic cost function $\gamma_s(a \to b)$ is associated with each operation, the cost of the
sequence of edit operations that transforms a finite string $A$ into $B$ is defined as
$$
d_s(A, B) = \min
\begin{cases}
\gamma_s(\Lambda \to B_1) + d_s(A, B_{2:|B|}) & |B| \ge 1 \\
\gamma_s(A_1 \to \Lambda) + d_s(A_{2:|A|}, B) & |A| \ge 1 \\
\gamma_s(A_1 \to B_1) + d_s(A_{2:|A|}, B_{2:|B|}) & |A| \ge 1 \wedge |B| \ge 1 \\
0 & |A| = 0 \wedge |B| = 0
\end{cases}
$$
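The recursive definition of the string edit distance can be written almost literally in code. The sketch below is a memoised Python transcription, not the dynamic programming formulation of [4]; the function names and the use of `None` for the null character are choices made for this example, and `unit_cost` is only one possible cost function.

```python
from functools import lru_cache


def string_edit_distance(a, b, cost):
    """Edit distance between strings a and b.

    cost(x, y) is the price of the operation x -> y; None plays the
    role of the null character.
    """
    @lru_cache(maxsize=None)
    def d(i, j):
        # Distance between the suffixes a[i:] and b[j:].
        if i == len(a) and j == len(b):
            return 0
        options = []
        if j < len(b):
            options.append(cost(None, b[j]) + d(i, j + 1))      # insertion
        if i < len(a):
            options.append(cost(a[i], None) + d(i + 1, j))      # deletion
        if i < len(a) and j < len(b):
            options.append(cost(a[i], b[j]) + d(i + 1, j + 1))  # substitution
        return min(options)

    return d(0, 0)


def unit_cost(x, y):
    """Illustrative cost: 0 for identical symbols, 1 for any other operation."""
    return 0 if x == y else 1
```

With `unit_cost`, the chain codes "4445" and "4455" are at distance 1, since a single substitution transforms one into the other.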
A similar idea is extended from ordered strings to ordered vector sets.
Let $V, W \in (\mathbb{R} \times [0, 2\pi])^*$ be two finite ordered sets of vectors and $\Lambda$ the null vector. $V_i$ is
the $i$th vector in the set $V$, $V_{N,i}$ is the norm and $V_{\alpha,i}$ is the angle of the
$i$th vector; $V_{i:j}$ is the subset from the $i$th to the $j$th component vector of $V$, both
included.
Now, an edit operation is a pair $(v, w)$ of vectors in $\mathbb{R} \times [0, 2\pi]$, $(v, w) \neq (\Lambda, \Lambda)$, taking the form
$(v, w^*) \cup (v^*, w)$; that is, one side is a single vector while the other may be a sequence of vectors. So, the basic edit operations are substitution (1 to 1) $v \to w$,
substitution (1 to N) called fragmentation $v \to w^+$, substitution (N to 1) called
consolidation $v^+ \to w$, insertion $\Lambda \to w$ and deletion $v \to \Lambda$. Here, we have
considered the case that one vector could be replaced by N, or vice versa.
When using dominant points, it is usual that a small change in the contour
generates a new dominant point, so when comparing two prototypes one vector
in the first prototype can be similar to N consecutive vectors from the second
prototype.
The cost of the sequence of edit operations that transforms a finite ordered vector
set $V$ into $W$, given a cost function $\gamma_v(v^*, w^*)$, is defined as
$$
d_v(V, W) = \min
\begin{cases}
\gamma_v(\Lambda \to W_1) + d_v(V, W_{2:|W|}) & |W| \ge 1 \\
\gamma_v(V_1 \to \Lambda) + d_v(V_{2:|V|}, W) & |V| \ge 1 \\
\gamma_v(V_1 \to W_1) + d_v(V_{2:|V|}, W_{2:|W|}) & |V| \ge 1 \wedge |W| \ge 1 \\
\gamma_v(V_1 \to W_{1:j}) + d_v(V_{2:|V|}, W_{j+1:|W|}) & |W| > 2,\ j \in [2, |W|] \\
\gamma_v(V_{1:i} \to W_1) + d_v(V_{i+1:|V|}, W_{2:|W|}) & |V| > 2,\ i \in [2, |V|] \\
0 & |V| = 0 \wedge |W| = 0
\end{cases}
$$
The efficient dynamic programming algorithm proposed in [4] for computing the
string edit distance can be extended to
compute the ordered vector set edit distance in the following way:
Function vectorEditDistance(V, W)
    D[0, 0] := 0;
    for i := 1 to |V| do D[i, 0] := D[i − 1, 0] + γv(V_i → Λ);
    for j := 1 to |W| do D[0, j] := D[0, j − 1] + γv(Λ → W_j);
    for i := 1 to |V| do
        for j := 1 to |W| do
            m1 := D[i − 1, j − 1] + γv(V_i → W_j);    // substitution (1 to 1)
            m2 := D[i − 1, j] + γv(V_i → Λ);          // deletion
            m3 := D[i, j − 1] + γv(Λ → W_j);          // insertion
            m := ∞;
            for k := 1 to |V| do                      // consolidation (N to 1)
                if (i − k) ≥ 0 then
                    m := min {m, D[i − k, j − 1] + γv(V_{i−k+1:i} → W_j)};
            endfor
            for k := 1 to |W| do                      // fragmentation (1 to N)
                if (j − k) ≥ 0 then
                    m := min {m, D[i − 1, j − k] + γv(V_i → W_{j−k+1:j})};
            endfor
            D[i, j] := min(m, m1, m2, m3);
        endfor
    endfor
    return D[|V|, |W|]
The complexity of the string edit distance algorithm is proportional to the product of the
lengths of both strings, O(|A| |B|). The vectorEditDistance function has
three nested loops and its complexity is O(|V| |W| max{|V|, |W|} O(γv)), but if
a vector can only be replaced by at most a fixed constant number of vectors
and the function γv is the one defined below, the complexity is reduced to O(|V| |W|).
Thus, the cost is similar to that of the string edit distance.
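The following Python sketch shows one way to implement the dynamic programming scheme with a bound on the number of vectors that may take part in a fragmentation or consolidation. The function signature, the parameter name `max_group` and the convention that the cost functions are passed in as arguments are assumptions of this example; they are not part of the original algorithm description.

```python
def vector_edit_distance(V, W, cost_sub, cost_ins, cost_del, max_group=3):
    """Edit distance between the ordered vector sets V and W.

    cost_sub(vs, ws) prices the replacement of the sequence vs by the
    sequence ws (one of them has length 1); cost_ins(w) and cost_del(v)
    price single insertions and deletions. max_group bounds N in the
    1-to-N and N-to-1 substitutions.
    """
    n, m = len(V), len(W)
    D = [[float("inf")] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + cost_del(V[i - 1])
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + cost_ins(W[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(D[i - 1][j] + cost_del(V[i - 1]),
                       D[i][j - 1] + cost_ins(W[j - 1]))
            # Consolidation: the last k vectors of V[:i] become W[j-1];
            # k = 1 is the ordinary 1-to-1 substitution.
            for k in range(1, min(max_group, i) + 1):
                best = min(best, D[i - k][j - 1] + cost_sub(V[i - k:i], W[j - 1:j]))
            # Fragmentation: V[i-1] becomes the last k vectors of W[:j].
            for k in range(2, min(max_group, j) + 1):
                best = min(best, D[i - 1][j - k] + cost_sub(V[i - 1:i], W[j - k:j]))
            D[i][j] = best
    return D[n][m]
```

Because `max_group` is a constant, the two inner loops run a bounded number of times and the table is filled in time proportional to |V| |W| times the cost of one call to the substitution function. With `max_group=1` the procedure reduces to the classical string edit distance scheme.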
To compute the difference between one vector and a set of N vectors, used in
vectorEditDistance, the following function is utilised:
Function γv(V_k → W_{i:j})
    float auxN := 0, auxAng := 0, r := 0, rSubs := 0, rLeft := 0
    auxN := V_{N,k}                // norm of the single vector
    auxAng := V_{α,k}              // angle of the single vector
    for l := i to j do
        if auxN ≥ 0 then           // part of the single vector's norm is still left
            rSubs := rSubs + auxN ∗ closest(auxAng, W_{α,l})
            auxAng := W_{α,l}
        endif
        auxN := auxN − W_{N,l}
    endfor
    if auxN ≥ 0 then               // part of the single vector's norm is still left
        rLeft := auxN ∗ kInsertion
    else                           // the norms of the W vectors exceed that of V
        rLeft := −auxN ∗ kDeletion
    endif
    return rSubs + rLeft
where closest(angle1, angle2) returns the smallest angle between both parameters, resulting in a value in [0, π]. The value kInsertion = kDeletion = π/2 is the maximum
possible difference between two angles.
The functions γv(V_{i:j} → W_k) and γv(V_i → W_j) are similar. In the
first case, the roles of the parameters are swapped, and in the second case, both parameters are single vectors.
The insertion and deletion functions are defined as γv(Λ → W_j) = |W_j| ∗
kInsertion and γv(V_i → Λ) = |V_i| ∗ kDeletion.
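The cost function described above can be transcribed into Python as follows. This is a sketch under the assumption that each vector is a (norm, angle) pair with the angle in radians; the names `gamma_one_to_many`, `gamma_insert` and `gamma_delete` are hypothetical, and the handling of angle wrap-around in `closest` is one possible reading of "smallest angle between both parameters".

```python
import math

K_INSERTION = K_DELETION = math.pi / 2  # values stated in the text


def closest(angle1, angle2):
    """Smallest angle between two directions, a value in [0, pi]."""
    diff = abs(angle1 - angle2) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)


def gamma_one_to_many(v, ws):
    """Cost of replacing the single vector v by the sequence of vectors ws."""
    aux_n, aux_ang = v          # remaining norm and current reference angle
    r_subs = 0.0
    for w_norm, w_ang in ws:
        if aux_n >= 0:          # part of the norm of v is still unmatched
            r_subs += aux_n * closest(aux_ang, w_ang)
            aux_ang = w_ang
        aux_n -= w_norm
    if aux_n >= 0:              # v is longer than the vectors in ws together
        r_left = aux_n * K_INSERTION
    else:                       # the vectors in ws are longer than v
        r_left = -aux_n * K_DELETION
    return r_subs + r_left


def gamma_insert(w):
    """Cost of inserting the vector w."""
    return w[0] * K_INSERTION


def gamma_delete(v):
    """Cost of deleting the vector v."""
    return v[0] * K_DELETION
```

Two identical vectors have cost zero, while any remaining difference in length is charged at the insertion or deletion rate.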
4 Experiments
Three algorithms have been compared based on different contour descriptions:
1. Classical Freeman chain code extracted from the object contour in the binary
image. No point reduction method is applied.
2. The ordered vector set extracted from the dominant points computed by the
algorithm described in [7], which will be referred to as non collinear dominant
points (NCDP).
3. The new structure based on the ordered vector set extracted from the dominant
points described in [6]. In that article, the 1-curvature and k-curvature algorithms are defined in order to select dominant points using these measures.
The authors showed that the obtained dominant points were similar for both
curvature measures, so we utilised the faster one: 1-curvature.
In the preliminary trials, the 1-curvature algorithm obtained lower
error rates than NCDP. Thus, the k parameter in the vectorEditDistance function was tuned when applied to 1-curvature. The k parameter is the maximum
number of consecutive vectors that can be replaced in a substitution operation; it was set to k = 1.
A classification task using the NIST SPECIAL DATABASE 3 of the National
Institute of Standards and Technology was performed using the different contour
descriptions enumerated above to represent the characters. Only the 26 uppercase handwritten characters were used. The increasing-size training samples for
the experiments were built by taking 500 writers and selecting the samples randomly. The nearest neighbour (NN) technique was used to perform classification.
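The nearest neighbour rule used in the experiments can be sketched in a few lines of Python. This is a plain exhaustive search, assuming the training set is a list of (prototype, label) pairs and that any of the edit distances discussed in this paper is supplied as the `distance` argument; the function name and the data layout are illustrative choices, not the experimental code.

```python
def nearest_neighbour(sample, training_set, distance):
    """Return the label of the training prototype closest to sample.

    training_set is a list of (prototype, label) pairs and distance is
    any function taking two prototypes and returning a number.
    """
    best_label, best_dist = None, float("inf")
    for prototype, label in training_set:
        d = distance(sample, prototype)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

Each classification computes one distance per training prototype, so the classification time grows with both the training set size and the cost of a single distance computation.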
[Figure 3: two plots for the vector set (1-curvature). Horizontal axis of both: number of vectors that can be replaced in the edit distance function (0 to 7). Vertical axes: (a) average error rate (%); (b) average classification time (sec.).]
Fig. 3. Results for NN classification of characters obtained with ordered vector set
(1-curvature), different training set (200 examples per class) and test set (50 samples
per class and 26 character classes) as a function of different number of vectors that can
be replaced in a substitution operation in the vector edit distance: (a) average error
rate ± standard deviation; (b) average classification time
[Figure 4: two plots comparing contour string, vector set (NCDP) and vector set (1-curvature). Horizontal axis of both: number of examples per class in training set (50 to 200). Vertical axes: (a) average error rate (%); (b) average classification time (sec.).]
Fig. 4. Results for NN classification of characters obtained with different contour representations as a function of different training example sizes: (a) average error rate ±
standard deviation; (b) average classification time
Figure 3 compares the error rates obtained in the classification task for different values of k in vectorEditDistance. This experiment
shows that the error rate decreases roughly linearly as k grows, up to a limit. As
k grows, the number of computations increases, and so does the classification time.
We therefore looked for the lowest k that reached the lowest error rate, so the optimal
parameter value was k = 3.
Figure 4 shows the classification error rate and the time used in the classification of 50 examples per class as a function of the training set size.
In all cases the use of Freeman chain codes generates a lower error rate (less
than 9%) in recognition than using ordered vector sets, although the classification
time is much higher. Thus, the ordered vector set description based on dominant
points 1-curvature [6] is a good trade-off choice. It also obtains a low error rate
(less than 11%) and it is 10 times faster than using the Freeman chain codes.
5 Conclusions and Future Work
The computation of the edit distance between ordered vector sets that represent
the contour of an object in a binary image (based on dominant point computation
using 1-curvature) is one order of magnitude faster than using Freeman chain
codes, and it has just a slightly higher error rate when using it for recognition.
The edit distance defined in this paper to compare ordered vector sets has similar
complexity to that of the string edit distance. Since the size of the ordered vector
set is significantly lower than that of the strings representing the same object,
the time needed to compute the distances required for classification is much
lower.
As can be seen in the results section, the error rate using ordered vector sets
based on dominant points is similar to that of using the Freeman chain code.
As future work, we plan to use some special labels for each vector to describe the curved shape of the original image in order to obtain a better description of the binary image contour and decrease the error rate in this classification
task. Another possible line of future work is to apply algorithms such as [11] in
order to optimise the cost functions for the ordered vector set edit distance.
References
1. Rico-Juan, J.R., Micó, L.: Comparison of AESA and LAESA search algorithms
using string and tree edit distances. Pattern Recognition Letters 24(9) (2003)
1427–1436
2. Freeman, H.: On the encoding of arbitrary geometric configurations. IRE Transactions on Electronic Computers 10 (1961) 260–268
3. Rico-Juan, J.R., Micó, L.: Some results about the use of tree/string edit distances
in a nearest neighbour classification task. In Goos, G., Hartmanis, J., van Leeuwen,
J., eds.: Pattern Recognition and Image Analysis. Number 2652 in Lecture Notes
in Computer Science, Puerto Andratx, Mallorca, Spain, Springer (2003) 821–828
4. Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. J. ACM 21
(1974) 168–173
5. Zhang, K., Shasha, D.: Simple fast algorithms for the editing distance between
trees and related problems. SIAM Journal on Computing 18 (1989) 1245–1262
6. Teh, C.H., Chin, R.T.: On the detection of dominant points on digital curves.
IEEE Trans. Pattern Anal. Mach. Intell. 11 (1989) 859–872
7. Iñesta, J.M., Buendía, M., Sarti, M.A.: Reliable polygonal approximations of imaged real objects through dominant point detection. Pattern Recognition 31 (1998)
685–697
8. Rico-Juan, J.R., Calera-Rubio, J.: Evaluation of handwritten character recognizers
using tree-edit-distance and fast nearest neighbour search. In Iñesta, J.M., Micó, L.,
eds.: Pattern Recognition in Information Systems, Alicante (Spain), ICEIS PRESS
(2002) 326–335
9. Serra, J.: Image Analysis and mathematical morphology. Academic Press (1982)
10. Carrasco, R.C., Forcada, M.L.: A note on the Nagendraprasad-Wang-Gupta thinning algorithm. Pattern Recognition Letters 16 (1995) 539–541
11. Ristad, E., Yianilos, P.: Learning string-edit distance. IEEE Transactions on
Pattern Analysis and Machine Intelligence 20 (1998) 522–532