
Edit Distance for Ordered Vector Sets: A Case of Study

2006, Lecture Notes in Computer Science

Digital contours in a binary image can be described as an ordered vector set. In this paper, an extension of the string edit distance is defined to compute the distance between a pair of ordered sets of vectors. This way, the differences between shapes can be computed in terms of editing costs. To achieve efficiency, a dominant point detection algorithm should be applied, removing redundant data before coding shapes into vectors. This edit distance can be used in nearest neighbour classification tasks. The advantages of this method applied to isolated handwritten character classification are shown, compared to similar methods based on string or tree representations of the binary image.

Juan Ramón Rico-Juan and José M. Iñesta⋆
Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, E-03071 Alicante, Spain
{juanra, inesta}@dlsi.ua.es

Topics: Dominant Points, Pattern Recognition, Structural Pattern Recognition.

1 Introduction

The description of an object contour in a binary image as a string [1] using Freeman codes [2], or as a tree [3,1], is widely used in pattern recognition. To use these structures in a recognition task, the edit distance is often taken as a measure of the differences between two instances. Both string edit distances [4] and tree edit distances [5] are used, depending on the data structure that represents the problem data.

In this paper, the object contour in a binary image is represented as an ordered vector set, and an edit distance is defined between pairs of instances of this representation. This measure is an extension of the string edit distance that adds two new rules and replaces symbols with vectors.

Freeman chain codes keep very fine details of the shapes, since they code the relation between every pair of adjacent pixels of the contour.
To save computation time and to remove irrelevant details, a dominant point detection algorithm is needed. The goal is to reduce the number of features that represent a binary image, removing redundant data so that the distance is computed faster. This keeps the final classification time low while preserving good error rates.

The remainder of this paper consists of four sections. In section 2, two different representations of the same binary image are extracted. In section 3, a new distance based on ordered vector sets is defined. In section 4, the results of experiments in a classification task, applying string and ordered vector set edit distances, are presented. Finally, in section 5, the conclusions and future work are presented.

⋆ Work supported by the Spanish CICYT under project TIC2003-08496-CO4 and Generalitat Valenciana I+D+i under project GV06/166.
D.-Y. Yeung et al. (Eds.): SSPR&SPR 2006, LNCS 4109, pp. 200–207, 2006. © Springer-Verlag Berlin Heidelberg 2006

2 Feature Extraction from a Binary Image

[Fig. 1. Freeman 2D code: the eight directions, labelled 0 to 7.]

[Fig. 2 panels: original binary image; open filter; contour, with its start pixel; thinning algorithm, with its start pixel; coded chain; linked dominant points; ordered vector set.]

Coded chain shown in Fig. 2, as printed (line breaks of the figure kept as spaces):
F = "44445676665666665 66655554454444322122 21222221223344456656 65665666667665544544 45445545666670001010 00001010100001000100 00001000001223232122 221222212222233"

Ordered vector set shown in Fig. 2:
|V|          angle
11.401754    -2.875341
5.000000     3.141593
2.828427     -2.356194
10.198039    -1.768192
4.123106     -1.325818
6.082763     -0.165149
7.280110     0.278300
4.000000     0.000000
9.219544     -2.922924
3.162278     2.819842
7.071068     -2.999696
14.560220    -1.849096
29.832868    1.333948
2.828427     0.785398
16.278821    0.185348

Fig. 2. General scheme. From the binary image, morphological filters are applied to correct gaps and spurious points. Thus, contour and skeleton are obtained. From the first, the chain code is obtained and from the second, the ordered vector set is extracted using a dominant point selection algorithm.

The goal of the ordered vector set is to describe the contour of an object using the least possible number of elements. The classical representation of a contour in a binary image links the contour pixels with their neighbours using the codes 0 to 7 (see Fig. 1), which represent a discrete set of 2D directions.
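The chain coding just described can be illustrated with a minimal Python sketch. The function name `chain_code`, the lookup table `FREEMAN` and the sample `contour` are hypothetical, and the direction convention (0 pointing up, codes increasing clockwise, with y growing downwards as in image coordinates) is only our reading of Fig. 1 and may not match the convention actually used.

```python
# Assumed convention: 0 = up, codes increase clockwise.
# Image coordinates: x grows to the right, y grows downwards.
FREEMAN = {
    (0, -1): 0, (1, -1): 1, (1, 0): 2, (1, 1): 3,
    (0, 1): 4, (-1, 1): 5, (-1, 0): 6, (-1, -1): 7,
}


def chain_code(pixels):
    """Encode a sequence of 8-connected contour pixels (x, y) as a Freeman string."""
    codes = []
    for (x0, y0), (x1, y1) in zip(pixels, pixels[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in FREEMAN:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not 8-neighbours")
        codes.append(str(FREEMAN[step]))
    return "".join(codes)


# Hypothetical contour fragment: two steps down, then three steps turning left and up.
contour = [(3, 0), (3, 1), (3, 2), (2, 3), (1, 3), (0, 2)]
print(chain_code(contour))  # prints 44567
```

Every pair of adjacent pixels contributes one symbol, so the length of the string grows with the length of the contour.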
This way, a string that represents the contour is obtained (Fig. 2 top-right). This kind of feature extraction assumes that all linked pixels are of equal importance. If we select only the most representative points of the contour and link them, a compact representation of 2D figures is obtained, with fewer features than with Freeman codes. The idea is to select a set of dominant points in a contour [6,7], link those points following the contour of the figure using 2D vectors, and then use this ordered vector set to represent the image (Fig. 2 bottom-right).

In the particular application of handwritten character recognition, it is recommended to apply some filter operations to the original image before extracting and coding the contours [8], including an opening filter [9] and a thinning algorithm [10], in order to remove noise and redundant information.

3 Ordered Vector Set Edit Distance

The string edit distance definition [4] is based on three edit operations: insertion, deletion, and substitution. Let $\Sigma$ be the alphabet, $A, B \in \Sigma^*$ two finite strings of characters, and $\Lambda$ a null character. $A_i$ is the $i$th character of the string $A$; $A_{i:j}$ is the substring from the $i$th to the $j$th character of $A$, both inclusive.

An edit operation is a pair $(a, b) \in (\Sigma \cup \{\Lambda\})^2$ with $(a, b) \neq (\Lambda, \Lambda)$. So the basic edit operations are substitution $a \to b$, insertion $\Lambda \to b$ and deletion $a \to \Lambda$. If a generic cost function $\gamma_s(a \to b)$ is associated with each operation, the cost of the sequence of edit operations that transforms a finite string $A$ into $B$ is defined as

\[
d_s(A, B) = \min
\begin{cases}
\gamma_s(\Lambda \to B_1) + d_s(A, B_{2:|B|}) & |B| \ge 1 \\
\gamma_s(A_1 \to \Lambda) + d_s(A_{2:|A|}, B) & |A| \ge 1 \\
\gamma_s(A_1 \to B_1) + d_s(A_{2:|A|}, B_{2:|B|}) & |A| \ge 1 \wedge |B| \ge 1 \\
0 & |A| = 0 \wedge |B| = 0
\end{cases}
\]

The same idea is extended from an ordered string to an ordered vector set. Let $V, W \in (\mathbb{R} \times [0, 2\pi])^*$ be finite ordered sets of vectors and $\Lambda$ a null vector.
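As an illustration, an ordered vector set of this kind can be held as a list of (norm, angle) pairs obtained by linking consecutive dominant points. The sketch below is only an assumption about how such a structure might be built: the function name `ordered_vector_set` and the sample points are hypothetical, and the dominant points are taken as already detected.

```python
import math


def ordered_vector_set(points):
    """Link consecutive dominant points (x, y) and return (norm, angle) pairs."""
    vectors = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        # math.atan2 returns an angle in (-pi, pi]; add 2*pi to negative
        # values if the [0, 2*pi] range of the definition is preferred.
        vectors.append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return vectors


# Hypothetical dominant points, in contour order.
points = [(0, 0), (3, 4), (3, 0)]
vectors = ordered_vector_set(points)
print(vectors)
```

The angles listed for the ordered vector set of Fig. 2 include negative values, which is consistent with an `atan2`-style range rather than $[0, 2\pi]$; either range works as long as both sets being compared use the same one.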
$V_i$ is the $i$th vector in the set $V$, $V_{N_i}$ is the norm and $V_{\alpha_i}$ is the angle of the $i$th vector; $V_{i:j}$ is the subset from the $i$th to the $j$th component vectors of $V$, both included.

Now, an edit operation is a pair of the form $(v, w^*)$ or $(v^*, w)$, where $v$ and $w$ are single vectors or $\Lambda$, $v^*$ and $w^*$ are runs of contiguous vectors, and $(v, w) \neq (\Lambda, \Lambda)$. So the basic edit operations are substitution (1 to 1) $v \to w$, substitution (1 to N), called fragmentation, $v \to w^+$, substitution (N to 1), called consolidation, $v^+ \to w$, insertion $\Lambda \to w$ and deletion $v \to \Lambda$.

Here we have considered the case in which one vector can be replaced by N vectors, or vice versa. When using dominant points, a small change in the contour often generates a new dominant point, so when comparing two prototypes, one vector in the first prototype can be similar to N contiguous vectors of the second prototype.

Given a cost function $\gamma_v(v^*, w^*)$, the cost of the sequence of edit operations that transforms a finite ordered vector set $V$ into $W$ is defined as

\[
d_v(V, W) = \min
\begin{cases}
\gamma_v(\Lambda \to W_1) + d_v(V, W_{2:|W|}) & |W| \ge 1 \\
\gamma_v(V_1 \to \Lambda) + d_v(V_{2:|V|}, W) & |V| \ge 1 \\
\gamma_v(V_1 \to W_1) + d_v(V_{2:|V|}, W_{2:|W|}) & |V| \ge 1 \wedge |W| \ge 1 \\
\min_{j \in [2,|W|]} \left\{ \gamma_v(V_1 \to W_{1:j}) + d_v(V_{2:|V|}, W_{j+1:|W|}) \right\} & |W| > 2 \\
\min_{i \in [2,|V|]} \left\{ \gamma_v(V_{1:i} \to W_1) + d_v(V_{i+1:|V|}, W_{2:|W|}) \right\} & |V| > 2 \\
0 & |V| = 0 \wedge |W| = 0
\end{cases}
\]

The efficient dynamic programming algorithm proposed in [4] for computing the string edit distance can be extended to compute the ordered vector set edit distance in the following way:

    Function vectorEditDistance(V, W)
        D[0, 0] := 0;
        for i := 1 to |V| do D[i, 0] := D[i − 1, 0] + γv(V_i → Λ);
        for j := 1 to |W| do D[0, j] := D[0, j − 1] + γv(Λ → W_j);
        for i := 1 to |V| do
            for j := 1 to |W| do
                m1 := D[i − 1, j − 1] + γv(V_i → W_j);    // substitution (1 to 1)
                m2 := D[i − 1, j] + γv(V_i → Λ);          // deletion
                m3 := D[i, j − 1] + γv(Λ → W_j);          // insertion
                m := ∞;
                for k := 1 to |V| do                      // consolidation (k to 1)
                    if (i − k) ≥ 0 then
                        m := min {m, D[i − k, j − 1] + γv(V_{i−k+1:i} → W_j)};
                    endif
                endfor
                for k := 1 to |W| do                      // fragmentation (1 to k)
                    if (j − k) ≥ 0 then
                        m := min {m, D[i − 1, j − k] + γv(V_i → W_{j−k+1:j})};
                    endif
                endfor
                D[i, j] := min(m, m1, m2, m3);
            endfor
        endfor
        return D[|V|, |W|]

The complexity of the string edit distance algorithm is proportional to the lengths of both strings, $O(|A|\,|B|)$. The vectorEditDistance function has three nested loops and its complexity is $O(|V|\,|W|\,\max\{|V|, |W|\}\,O(\gamma_v))$, but if we consider that a vector can be replaced by at most a fixed constant number of vectors, and use the function $\gamma_v$ defined below, the complexity is reduced to $O(|V|\,|W|)$. Thus, the cost is similar to that of the string edit distance.

To compute the difference between one vector and a set of N vectors, as used in vectorEditDistance, the following function is utilised:

    Function γv(V_k → W_{i:j})
        float auxN := 0, auxAng := 0, r := 0, rSubs := 0, rLeft := 0
        auxN := VN_k                 // norm of the single vector
        auxAng := Vα_k               // angle of the single vector
        for l := i to j do
            if auxN ≥ 0 then         // some norm of the single vector is left
                rSubs := rSubs + auxN ∗ closest(auxAng, Wα_l)
                auxAng := Wα_l
            endif
            auxN := auxN − WN_l
        endfor
        if auxN ≥ 0 then             // norm of the single vector left over
            rLeft := auxN ∗ kInsertion
        else                         // norms of the W vectors exceed that of V
            rLeft := −auxN ∗ kDeletion
        endif
        return rSubs + rLeft

where closest(angle1, angle2) returns the smallest angle between both parameters, a value in $[0, \pi]$. The constant kInsertion = kDeletion = π/2 is the maximum possible difference between two angles.

The functions $\gamma_v(V_{i:j} \to W_k)$ and $\gamma_v(V_i \to W_j)$ are similar. In the first case, the parameters change their order, and in the second case, both parameters are single vectors. The insertion and deletion functions are defined as $\gamma_v(\Lambda \to W_j) = |W_j| \ast kInsertion$ and $\gamma_v(V_i \to \Lambda) = |V_i| \ast kDeletion$.
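The pseudocode of this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `closest`, `cost_one_to_many`, `cost_many_to_one` and `vector_edit_distance` are ours, vectors are (norm, angle) pairs, the bound `max_k` on the run length is an assumed parameter standing for the "fixed constant number of vectors", the run matched against `D[i − k][j − 1]` is taken to be the last `k` vectors ending at position `i` (our reading of the indices), and consolidation is assumed to reuse the fragmentation cost with its arguments swapped.

```python
import math

K_INSERTION = K_DELETION = math.pi / 2  # constants as stated in the text


def closest(a, b):
    """Smallest angle between two directions, in [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def cost_one_to_many(v, ws):
    """Cost of replacing the single vector v by the run of vectors ws."""
    aux_n, aux_ang = v
    r_subs = 0.0
    for w_norm, w_ang in ws:
        if aux_n >= 0:  # some norm of the single vector is still left
            r_subs += aux_n * closest(aux_ang, w_ang)
            aux_ang = w_ang
        aux_n -= w_norm
    r_left = aux_n * K_INSERTION if aux_n >= 0 else -aux_n * K_DELETION
    return r_subs + r_left


def cost_many_to_one(vs, w):
    """Assumption: consolidation mirrors fragmentation with swapped arguments."""
    return cost_one_to_many(w, vs)


def vector_edit_distance(V, W, max_k=3):
    """Edit distance between two ordered lists of (norm, angle) vectors."""
    n, m = len(V), len(W)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + V[i - 1][0] * K_DELETION
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + W[j - 1][0] * K_INSERTION
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            v, w = V[i - 1], W[j - 1]
            best = min(
                D[i - 1][j - 1] + cost_one_to_many(v, [w]),  # substitution
                D[i - 1][j] + v[0] * K_DELETION,             # deletion
                D[i][j - 1] + w[0] * K_INSERTION,            # insertion
            )
            for k in range(2, min(max_k, i) + 1):            # consolidation
                best = min(best, D[i - k][j - 1] + cost_many_to_one(V[i - k:i], w))
            for k in range(2, min(max_k, j) + 1):            # fragmentation
                best = min(best, D[i - 1][j - k] + cost_one_to_many(v, W[j - k:j]))
            D[i][j] = best
    return D[n][m]


# One vector of norm 2 against the same segment split into two halves.
V = [(2.0, 0.0)]
W = [(1.0, 0.0), (1.0, 0.0)]
print(vector_edit_distance(V, W, max_k=2))  # 0.0: matched by one fragmentation
print(vector_edit_distance(V, W, max_k=1))  # larger: needs an extra insertion
```

With `max_k` fixed, the two inner loops run a constant number of times, which is the situation in which the text states that the complexity drops to $O(|V|\,|W|)$.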
4 Experiments

Three algorithms have been compared, based on different contour descriptions:

1. The classical Freeman chain code extracted from the object contour in the binary image. No point reduction method is applied.
2. The ordered vector set extracted from the dominant points computed by the algorithm described in [7], which will be referred to as non collinear dominant points (NCDP).
3. The new structure based on the ordered vector set extracted from the dominant points described in [6]. In that article, the 1-curvature and k-curvature algorithms are defined in order to select dominant points using these measures. The authors showed that the dominant points obtained were similar for both curvature measures, so we used the faster one: 1-curvature.

In the preliminary trials, the 1-curvature algorithm obtained lower error rates than NCDP. Thus, the k parameter of the vectorEditDistance function was tuned when applied to 1-curvature. The k parameter is the maximum number of contiguous vectors that can be replaced in a single substitution operation, and it was set to k = 1.

A classification task using the NIST SPECIAL DATABASE 3 of the National Institute of Standards and Technology was performed, using the different contour descriptions enumerated above to represent the characters. Only the 26 uppercase handwritten characters were used. The training samples of increasing size for the experiments were built by taking 500 writers and selecting the samples randomly. The nearest neighbour (NN) technique was used to perform the classification.

[Fig. 3 plots, series "Vector set (1-curvature)": (a) average error rate (%) and (b) average classification time (sec.), both against the number of vectors that can be replaced in the edit distance function.]

Fig. 3. Results for NN classification of characters obtained with the ordered vector set (1-curvature), a training set of 200 examples per class and a separate test set of 50 samples per class (26 character classes), as a function of the number of vectors that can be replaced in a substitution operation in the vector edit distance: (a) average error rate ± standard deviation; (b) average classification time.

[Fig. 4 plots, series "Contour string", "Vector set (NCDP)" and "Vector set (1-curvature)": (a) average error rate (%) and (b) average classification time (sec.), both against the number of examples per class in the training set (50 to 200).]

Fig. 4. Results for NN classification of characters obtained with different contour representations as a function of the training set size: (a) average error rate ± standard deviation; (b) average classification time.

Figure 3 compares the error rate in the vector classification task for different values of k (vectorEditDistance). This experiment shows that the error rate decreases linearly as k grows, up to a limit. As k grows, the number of computations increases, and so does the classification time. In this case, the lowest error rate was reached with the smallest such k, so the optimal parameter value was k = 3.

Figure 4 shows the classification error rate and the time used in the classification of 50 examples per class as a function of the training set size. In all cases, the use of Freeman chain codes yields a lower recognition error rate (less than 9%) than the ordered vector sets, although the classification time is much higher. Thus, the ordered vector set description based on 1-curvature dominant points [6] is a good trade-off. It also obtains a low error rate (less than 11%) and it is 10 times faster than using the Freeman chain codes.
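The NN rule used in these experiments works with any of the edit distances discussed. The sketch below pairs it with a string edit distance over Freeman strings; unit edit costs, the function names and the tiny training set are assumptions made for illustration only, not the costs or data used in the experiments.

```python
def string_edit_distance(a, b):
    """Dynamic programming string edit distance [4] with unit costs (assumed)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion of ca
                curr[j - 1] + 1,           # insertion of cb
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def nearest_neighbour(sample, training, distance):
    """Return the label of the training prototype closest to sample."""
    label, _ = min(training, key=lambda item: distance(sample, item[1]))
    return label


# Hypothetical prototypes: (class label, Freeman chain code).
training = [("A", "44567"), ("B", "00122")]
print(nearest_neighbour("44566", training, string_edit_distance))  # prints A
```

Each classification computes one distance per training prototype, so the cost of a single distance evaluation, and therefore the length of the representation, dominates the classification time.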
5 Conclusions and Future Work

The computation of the edit distance between ordered vector sets that represent the contour of an object in a binary image (based on dominant point computation using 1-curvature) is one order of magnitude faster than using Freeman chain codes, and it has only a slightly higher error rate when used for recognition.

The edit distance defined in this paper to compare ordered vector sets has a complexity similar to that of the string edit distance. Since the size of the ordered vector set is significantly smaller than that of the string representing the same object, the time needed to compute the distances required for classification is much lower. As can be seen in the results section, the error rate using the ordered vector set based on dominant points is similar to that obtained with the Freeman chain code.

As future work, we plan to use some special labels for each vector to describe the curved shape of the original image, in order to obtain a better description of the binary image contour and decrease the error rate in this classification task. Another possible line of future work is to apply algorithms such as [11] in order to optimise the cost functions for the ordered vector set edit distance.

References

1. Rico-Juan, J.R., Micó, L.: Comparison of AESA and LAESA search algorithms using string and tree edit distances. Pattern Recognition Letters 24(9) (2003) 1427–1436
2. Freeman, H.: On the encoding of arbitrary geometric configurations. IRE Transactions on Electronic Computers 10 (1961) 260–268
3. Rico-Juan, J.R., Micó, L.: Some results about the use of tree/string edit distances in a nearest neighbour classification task. In Goos, G., Hartmanis, J., van Leeuwen, J., eds.: Pattern Recognition and Image Analysis. Number 2652 in Lecture Notes in Computer Science, Puerto Andratx, Mallorca, Spain, Springer (2003) 821–828
4. Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. J. ACM 21 (1974) 168–173
5. Zhang, K., Shasha, D.: Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing 18 (1989) 1245–1262
6. Teh, C.H., Chin, R.T.: On the detection of dominant points on digital curves. IEEE Trans. Pattern Anal. Mach. Intell. 11 (1989) 859–872
7. Iñesta, J.M., Buendía, M., Sarti, M.A.: Reliable polygonal approximations of imaged real objects through dominant point detection. Pattern Recognition 31 (1998) 685–697
8. Rico-Juan, J.R., Calera-Rubio, J.: Evaluation of handwritten character recognizers using tree-edit-distance and fast nearest neighbour search. In Iñesta, J.M., Micó, L., eds.: Pattern Recognition in Information Systems, Alicante (Spain), ICEIS PRESS (2002) 326–335
9. Serra, J.: Image Analysis and Mathematical Morphology. Academic Press (1982)
10. Carrasco, R.C., Forcada, M.L.: A note on the Nagendraprasad-Wang-Gupta thinning algorithm. Pattern Recognition Letters 16 (1995) 539–541
11. Ristad, E., Yianilos, P.: Learning string-edit distance. IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998) 522–532