A Neural Network Approach For Learning Image Similarity in Adaptive Cbir
A Neural Network Approach For Learning Image Similarity in Adaptive Cbir
A Neural Network Approach For Learning Image Similarity in Adaptive Cbir
P. Muneesawang
School. of Elect. and Info. Engin. University of Sydney Sydney, Australia
L. G u a n
Dept. of Elect. and Comp. Engin Ryerson Polytechnic University Toronto, Canada
Abstract In this paper the adoption of neural network techniques is studied for the purpose of image retrieval. More specifically, we propose a n adaptive retrieval system which incorporates learning capability into the image retrieval module where the network weights represent the adaptivity. This system can learn users notions of similarity between images through the continual relevance feedback from the users. Accordingly it makes the proper adjustment to improve performance. This retrieval system has demonstrated its effectiveness in performance. It is conflrmed by simulations conducted for applications such as texture retrieval and retrieval of the DCT compressed images.
INTRODUCTION
In this paper the adoption of a machine-learning approach is proposed for image matching in content-based image retrieval (CBIR). This learning approach has one main advantage over the traditional retrieval approaches: it allows the retrieval system to solve the problem of fuzzy understanding of users goals that are not realized by a one-shot retrieval procedure. Specifically, we for this task in view of propose the adoption of neural network techniques [1][2] their learning capability and their ability to simulate universal mapping. These techniques allow the users to directly modify the query characteristics by specifying their desired image attributes in the form of training examples. Within this framework, the retrieval system can adjust its strategy to progressively model the notion of image similarity through continual relevance feedback from the user. In fact it uses individual user choices to decide relevance. We have adopted a Self-organizing Tree Map (SOTM) [l]and a Learning Vector Quantization (LVQ) [3] to generate prototypes in the form of local data clustering. This is used for approximating the distribution of the relevant s a ples which are associated with the training data. The resulting prototypes are then incorporated into a single-pass radial basis function network (RBFN) for ranking of an image database. In the experiments, the proposed interactive system has been applied to a compressed domain image retrieval as well as to the retrieval system that use image features extrated from uncompressed images. The comparison with other methods is provided.
257
Authorized licensed use limited to: NARESUAN UNIVERSITY. Downloaded on January 5, 2010 at 04:46 from IEEE Xplore. Restrictions apply.
(ii). Similarity matching. Randomly select a new data point x , and find the best-matching (winning) neuron at time step t by using the minimumdistance Euclidean criterion:
wj* = argmin Ilx(t) - wjll,
3
j = 1 , 2 , . . . ,1
(1)
(iii). Updating. If ( ( x ( t) wj. (1 5 H ( t ) , where H ( t ) is the hierarchy function used to control the levels of the tree, then assign x ( t )to the j-th cluster, and adjust the synaptic weight vector according t o the reinforced learning rule: (2) Wj(t 1) = W j ( t ) a(t)[x(t) - Wj(t)], where a ( t )is the learning rate, which decreases monotonically with time, 0 < a(t)< 1. Else form a new subnode starting with x .
258
Authorized licensed use limited to: NARESUAN UNIVERSITY. Downloaded on January 5, 2010 at 04:46 from IEEE Xplore. Restrictions apply.
(iv). Continuation. Continue with step (ii) until no noticeable changes in the feature map are observed. The SOTM algorithm obtains a new set of cluster centers VO= { V I , .. .,V k , . . . ,V K } , where the number of centers I< is controlled by the function H ( t ) . In the experiment, H ( t ) was initialized by the norm of the training vectors in X + , and was reduced linearly. (2). Cluster Modification. At this stage, the negative samples in Xare used to tune the decision boundaries of the initial set of prototypes VO. We employ the antireinforced learning rule in the LVQ algorithm [3] to perform this operation. The algorithm starts with randomly choosing input vector xA(t) from the training set { x A } g Z l . Then, classify xA(t) to node v i if llxA(t) v i 1 1 < llx&,(t) - Vk11,Vk # i, and apply the antireinforced learning rule to the corresponding cluster centers as follows:
Vi(t
+ 1) = V i ( t ) - rl(t)[x:,(t)- V i ( t ) l ,
(3)
where ~ ( t is ) the learning constant which decreases monotonically with the number of iterations t , 0 5 q ( t ) 5 1. After several passes through the training data, the node vectors typically converge, and the modification process is complete. Note that we desire to move the prototypes slightly.
where
Uk
259
Authorized licensed use limited to: NARESUAN UNIVERSITY. Downloaded on January 5, 2010 at 04:46 from IEEE Xplore. Restrictions apply.
and. P = 0.5 being an overlapping factor. The estimated function output F ( x ) for x is then given as:
K
F(x) =
k=l
G k (x, v k ok)
(6)
Intuitively, the RBFN performs similarity evaluation by linearly combining the output G k . If the current input vector x (associated with an image in the database) is close to one of the RBF centers V k (i.e., the k-th prototype) in a Euclidean sense, the corresponding RBF unit G k given by Eq (4) will increase, and the estimated output F ( x ) will also increase indicating the greater similarity of the corresponding image. On the other hand, those input vectors that are far away from each of V k , k = 1 , . . .,K do not appreciably contribute to the summation due to the exponentially decaying weighting function.
260
Authorized licensed use limited to: NARESUAN UNIVERSITY. Downloaded on January 5, 2010 at 04:46 from IEEE Xplore. Restrictions apply.
Method
NN
0 Iter.
73.71 67.10
Iter.
2 Iter.
86.53 77.76
Iter.
RFM
83.72 75.74
87.55 78.46
Table 1: Average retrieval rate (%) of 116 query images on the Brodatz database.
Figure 1: left: (a) the retrieval results before learning similarity in answer to the query 024-172.JPG; right: (b) the retrieval resuls after learningsimilarity.
26 1
Authorized licensed use limited to: NARESUAN UNIVERSITY. Downloaded on January 5, 2010 at 04:46 from IEEE Xplore. Restrictions apply.
Query
0 Iter.
1 Iter. 14 10 5 5 11
2 Iter.
14
10
Query
046-160.JPEG 065-200.JPEG 058-143.JPEG 076-126.JPEG 040-123.JPEG
6 7 12
0 Item 4 3 8
4
1 Iter. 6 7 11 6 10
2 Iter. 6 7 10
9
Table 2: Retrieval results on the JPEG database (4678 images) using the DCT energy histogram feature. The table shows the number of relevant image in the top 15 matches.
CONCLUSION
We have proposed the adoption of neural network techniques for characterizing the behavior of human users in an interactive session where relevance feedback is applied. With the SOTM/VQ algorithms, a more effective local analysis of relevance is provided than with another better-known model: query modification approach. The network structure also includes a single-pass RBFN to perform a more accurate non-linear evaluation of image similarity for ranking purposes. Based on a simulation performance comparison, this proposed learning-based approach appears to be highly effective for retrieval application on compressed and uncompressed-domain image retrievals.
References
H.Kong and L.Guan, Self-organizing tree map for eliminating impulse noise with random intensity distributions, J . of Electronic Imaging, vo1.7/1, pp.36-44, 1998.
T. Sigitani, Y. Liguni, H. Maeda, Image interpolation for progressive transmission by using radial basis function networks, I E E E Trans. on Neural Networks, ~01.10/2,pp. 381-390, 1999.
S. Haykin, Neural Networks: a Comprehensive Foundation, Prentice Hall, NJ, 1999.
Y. Rui, T.S. Huang, and S. Mehrotra, Content-based image retrieval with relevance feedback in MARS, Proc. I E E E Int. Conf. on Image Processing, pp. 815-818, 1997.
http://vivaldi.ece.ucsb.edu/users/wei/codes/thml, 1999.
B. S. Manjunath and W. Y. Ma, Texture features for browsing and retrieval of image data, I E E E Trans. of Pattern Analysis and Machine Intelligence, vol. 18/8, pp. 837-842, 1996.
Media Graphics International, http: //www,media-graphics.net . Photo Gallery
5,000
Vol.1
CDROM,
J.A. Lay and L. Guan, Image retrieval based on energy histogram of the low frequency DCT coefficients, Proc. of the I C A S S P , pp.3009-3012, 1999.
262
Authorized licensed use limited to: NARESUAN UNIVERSITY. Downloaded on January 5, 2010 at 04:46 from IEEE Xplore. Restrictions apply.