International Journal of Computer Engineering and Technology (IJCET)
ISSN 0976 – 6367(Print), ISSN 0976 – 6375(Online)
Volume 4, Issue 3, May-June (2013), pp. 340-352
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI)
www.jifactor.com
DYNAMIC HAND GESTURE RECOGNITION USING CBIR
Mr. Shivamurthy R.C.
Research Scholar, Department of Computer Science & Engineering,
Akshaya Institute of Technology, Tumkur-572106, Karnataka, India
Dr. B.P. Mallikarjunaswamy
Professor, Department of Computer Science & Engineering,
Sri Siddhartha Institute of Technology, Maralur, Tumkur-572105, Karnataka, India
Mr. Pradeep Kumar B.P.
Department of Electronics & Communication Engineering,
Akshaya Institute of Technology, Tumkur-572106, Karnataka, India
ABSTRACT
Image databases and archives offer many research areas; significant among them is content-based image retrieval (CBIR), which deals with large collections of images. CBIR depends mainly on the way image features are extracted. The main focus of the proposed system is on color and shape feature extraction for hand tracking, intended as a step towards palm and face tracking for a perceptual user interface. By revisiting the k-means clustering algorithm, this paper extends a default implementation to allow tracking on an arbitrary number and type of feature spaces. We weight the multidimensional histogram with a simple monotonically decreasing kernel profile prior to histogram back-projection in order to compute the probability that a pixel value belongs to the target model. We examine the effectiveness of the k-means clustering algorithm as a general-purpose hand and face tracking approach in the case where no assumptions are made about the palm to be tracked.
Keywords: Content-based image retrieval (CBIR), histogram back-projection, k-means clustering algorithm
1. INTRODUCTION

There has been significant growth in the IT field for medical imaging, where various techniques and processes are involved in creating images of the human body for clinical procedures. This rapid growth has raised medical science to a very high level. Large databases of medical images are the result of collaborative approaches to handling medical procedures. An intelligent, fast and accurate medical image retrieval system is the ultimate goal of medical imaging. With data rapidly growing in size, and with the features of various medical images being fuzzy in nature for different organs and hard to distinguish semantically, the retrieval system should be very adaptive.
Medical images are distinguished from general-purpose images (GPI) by their characteristics; hence the processes adopted for searching GPI cannot be adopted in medical image processing systems. Content-based image retrieval (CBIR) retrieves the images most visually similar to a given query image from a database of images. CBIR assists the physician in diagnosis; it does not aim to replace the physician by predicting the disease of a particular case. Diagnostic information can be derived from the visual characteristics of a disease, and sometimes from similar images corresponding to the same disease category. The physician can use the output of a CBIR system to gain more confidence in his or her decision, or to consider other possibilities as well.
Advances in CBIR systems have led researchers to new approaches in information retrieval for image databases, and some degree of success has already been met in constrained problems in medical applications. The generic framework is shown in Figure 1.
Figure 1: Diagram for content-based image retrieval system
Notwithstanding the progress already achieved in the few frameworks available, there is still a lot of work to be done to develop a commercial system able to perform image retrieval and diagnosis across a broader image domain.
2. METHODS USED FOR IMPLEMENTING CBIR

2.1 Shape-based Method
The image feature extracted for shape-based image retrieval is usually an N-dimensional feature vector, which can be regarded as a point in N-dimensional space. After the images are stored in the database using the extracted feature vectors, retrieval amounts to determining the similarity between the query image and the target images in the database, which is essentially the distance between the feature vectors that represent the images.
The desirable distance measure should reflect human perception. Various similarity measures have been exploited in image retrieval; we use Euclidean distance for similarity measurement in our implementation.
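As an illustration (not code from the paper), the Euclidean-distance ranking described above can be sketched in a few lines of Python; the feature vectors and image names below are hypothetical.

```python
import math

def euclidean_distance(u, v):
    """Distance between two N-dimensional feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_by_similarity(query, database):
    """Return image names sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: euclidean_distance(query, database[name]))

# Hypothetical 3-D shape feature vectors.
features = {"img_a": [0.9, 0.1, 0.4],
            "img_b": [0.2, 0.8, 0.5],
            "img_c": [0.85, 0.15, 0.35]}
print(rank_by_similarity([0.88, 0.12, 0.38], features))  # ['img_a', 'img_c', 'img_b']
```

In a real system the feature vectors would come from the shape extraction step and the ranking would be truncated to the top-k results.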
2.2 Texture-based Method
There is a larger variety of texture measures than of color measures. Wavelets and Gabor filters are among the common measures for capturing the texture of images, with Gabor filters performing better. Texture measures capture the characteristics of images or image parts with respect to changes in certain directions and the scale of those changes, which is most useful for regions or images with homogeneous texture.
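To make the Gabor filter concrete, here is a minimal pure-Python sketch (not from the paper) that samples a 2-D Gabor kernel: a Gaussian envelope multiplied by an oriented cosine wave. The parameter values are illustrative.

```python
import math

def gabor_kernel(size, sigma, theta, lam, psi=0.0, gamma=1.0):
    """Sample a 2-D Gabor kernel: a Gaussian envelope (width sigma,
    aspect gamma) times a cosine wave of wavelength lam oriented at
    angle theta (radians), with phase offset psi."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)   # rotated coordinates
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / lam + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

k = gabor_kernel(size=7, sigma=2.0, theta=0.0, lam=4.0)
# Centre of the kernel is exp(0) * cos(0) = 1.0
```

Convolving an image with a bank of such kernels at several orientations and scales yields the texture responses discussed above.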
2.3 Using Low-Level Visual Features
The image retrieval process has two main phases, a preprocessing phase and a retrieval phase, described as follows.
The preprocessing phase has two main components: a feature extraction model and a classification model. Its input is the original image database, i.e. images from the ImageCLEF medical collection with more than 66,000 medical images; its output is a feature database and an index relating each image to its modality.
3. LOW-LEVEL IMAGE FEATURES

Content-based image retrieval is based principally on low-level image features. Since users are usually more interested in specific regions than in the entire image, most current content-based image retrieval systems are region-based: the image is divided into regions on which the other operations are performed. To carry out this first step we need to segment the original image. To specify queries we consider several classes of features: color, texture and shape.
3.1 Image segmentation
Automatic image segmentation is a difficult task. Many techniques have been proposed in the past, such as curve evolution [11] and graph partitioning [12], and many of the known techniques work well for images that have homogeneous color regions; an example of segmenting a picture is shown in Figure 2. Pictures from the real world, however, are richer in color and shade. The literature contains many segmentation algorithms, such as JSEG [10], Blobworld [9] and KMCC [13].
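Since the abstract builds on k-means clustering, a toy Python sketch of the clustering step behind color-based segmentation may help; this is an illustration under simplified assumptions (centroids naively seeded with the first k pixels), not the paper's implementation.

```python
def kmeans(pixels, k, iters=10):
    """Plain k-means on color vectors. Returns (centroids, labels),
    where labels[i] is the cluster index of pixels[i]."""
    centroids = [tuple(p) for p in pixels[:k]]   # naive deterministic seeding
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest centroid.
        for i, p in enumerate(pixels):
            labels[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(p, centroids[c])))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(pixels, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centroids, labels

# Two obvious color groups: near-red and near-blue pixels.
pixels = [(250, 10, 10), (240, 15, 5), (10, 10, 250), (5, 20, 240)]
centroids, labels = kmeans(pixels, k=2)
print(labels)  # [0, 0, 1, 1]
```

In region-based retrieval each cluster of pixels would then be treated as a candidate region.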
Figure 2: Example of segmentation of a picture (the red lines divide the regions)
3.2 Color feature
Color features are easy to obtain, often directly from the pixels intensities like color
histogram over the whole image, over a fixed sub image, or over a segmented region. It is one
of the most used features in image retrieval. The colors are described by their color space:
RGB, LAB, LUV, HSV,
RGB is the best known space color and is it commonly used for visualization. The acronym
stands for Red Green Blue. This space color can be seen as a cube where the horizontal x-axis
as red values increasing to the left, y-axis as blue increasing to the lower right and the vertical
z-axis as green increasing towards the top, as in figure 3.
Figure 3: The RGB color model mapped to a cube.
The origin, black, is hidden behind the cube. RGB is a convenient color model for computer graphics because the human visual system works in a way that is similar, though not quite identical, to an RGB color space.
Another well-known color space is HSV, which stands for Hue, Saturation, Value. Referring to Figure 4, we can see this color space as a cylinder, where the angle around the central vertical axis corresponds to hue, the distance from the axis corresponds to saturation, and the distance along the axis corresponds to value (also called brightness).
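As a concrete example of moving between the two color spaces discussed here, Python's standard colorsys module converts RGB to HSV directly (this is an illustration, not the paper's code; colorsys works on floats in [0, 1]):

```python
import colorsys

def rgb_to_hsv8(r, g, b):
    """Convert 8-bit RGB channels to (hue, saturation, value), each in [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

h, s, v = rgb_to_hsv8(255, 0, 0)   # pure red
print(h, s, v)                     # 0.0 1.0 1.0
```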
Figure 4: The HSV space color.
3.3 Shape features
There is no universal definition of what a shape is. Impressions of shape can be conveyed by color or intensity patterns, or texture, from which a geometrical representation can be derived. This recalls Plato's Meno, where a Socratic dialogue is built around the word "figure": "figure is the only existing thing that is found always following color" (Socrates). Shape features (such as aspect ratio, circularity, Fourier descriptors, consecutive boundary segments, etc.) are very important image features, even though they are not so commonly used in region-based image retrieval systems: owing to the inaccuracy of the segmentation step, shape features are more difficult to apply than color or texture features. Nevertheless, some content-based image retrieval systems in the literature do use shape features, such as [15], [13] and [14].
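Two of the shape descriptors named above are simple enough to sketch directly; the following Python lines (illustrative, not from the paper) compute circularity and bounding-box aspect ratio from region measurements:

```python
import math

def circularity(area, perimeter):
    """Classic roundness measure 4*pi*A / P^2: 1.0 for a perfect circle,
    smaller for elongated or ragged shapes."""
    return 4 * math.pi * area / (perimeter ** 2)

def aspect_ratio(width, height):
    """Bounding-box aspect ratio, normalized to be >= 1."""
    return max(width, height) / min(width, height)

# A 10x10 square: area 100, perimeter 40 -> circularity = pi/4.
print(round(circularity(100, 40), 3))  # 0.785
```

The area and perimeter values would come from the segmented region's pixel count and boundary length.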
4. INTRODUCTION TO GESTURE

Human hand gestures are a widely used mode of nonverbal interaction, ranging from the simple action of pointing with a finger, or using the hands to move objects around, to more complex expressions of feelings and communication with others. Human-computer interaction research is driven by the central goal of removing complex and cumbersome interaction devices and replacing them with more obvious and expressive means of interaction.
4.1 Background registration and foreground segmentation
Skin color detection in real situations is a very important process in gesture recognition, so we divide the images into foreground and background regions according to the dynamic characteristics of the film. The background regions, such as the face, neck and areas of adjacent skin color, do not change during the recognition process; we call the hand gesture area the foreground region. We subtract the background from the images to get the foreground regions. The foreground image is then ANDed with the skin color image to get a more accurate image of the hand region.
Differential techniques are used in video image processing: the current image is subtracted from the background image, and the resulting image is called the difference foreground image. An appropriate threshold is selected, and the difference foreground image is then binarized to produce the foreground binary image. The basic principle of differential image processing is to compute, in the spatial domain, the difference between the gray-scale-transformed image of the detection area and the background image.
That can be expressed as the absolute frame difference

    fd(x, y) = | f(x, y, ti) - f(x, y, tj) |

where f(x,y,ti) and f(x,y,tj) represent the brightness values of the pixel at position (x,y) at moments ti and tj respectively; the values range from 0 to 255. A significant difference in the pixel's brightness at position (x,y) can be tested with the standard likelihood ratio

    L(x, y) = [ (si^2 + sj^2)/2 + ((mi - mj)/2)^2 ]^2 / (si^2 * sj^2) > t

where mk and sk^2 (k = i, j) are the mean and variance of f(x,y,tk) within a small neighborhood q(x,y) of position (x,y), and t is a threshold.
Segmentation is the process of partitioning an image into multiple segments based on certain attributes. The ultimate goal of segmentation is to convert the image into a simplified form that is more useful than the original image; the choice among segmentation techniques depends on the requirements. Background subtraction provides an effective means of segmenting objects moving in front of a static background, and researchers have traditionally used combinations of morphological operations to remove the noise inherent in the background-subtracted result.
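The frame differencing and thresholding steps above can be sketched in a few lines of Python; the images here are toy 2-D brightness arrays, not real frames.

```python
def difference_foreground(frame, background, threshold):
    """Per-pixel absolute difference against a static background,
    binarized with a fixed threshold: 1 = foreground (moving object),
    0 = background."""
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

background = [[100, 100, 100],
              [100, 100, 100]]
frame      = [[100, 180, 100],
              [102, 175,  99]]
mask = difference_foreground(frame, background, threshold=30)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

The binary mask would then be cleaned with morphological operations before the AND with the skin color image.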
4.2 Tracking
Tracking starts with the choice of color space used for skin modeling. RGB is a convenient color model for computer graphics because the human visual system works in a way that is similar to an RGB color space, expressing color as a combination of three base colored rays (red, green and blue). A normalized RGB skin color model is considered more appropriate for hand skin, and can easily be obtained from the RGB values by a simple normalization.
The YCbCr format is considered better than the RGB color space at describing skin properties, and its clustering characteristics are better than RGB's. YCbCr separates out a luminance signal (Y) and two chrominance components (Cb and Cr). YCbCr can cope with various illumination conditions by discarding the Y signal, which not only improves performance but also reduces the data dimension compared to RGB.
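For reference, the standard ITU-R BT.601 full-range RGB-to-YCbCr conversion used in this kind of pipeline can be written as follows (an illustrative sketch, not the paper's code):

```python
def rgb_to_ycbcr(r, g, b):
    """ITU-R BT.601 full-range RGB -> YCbCr. Discarding Y leaves the
    chrominance pair (Cb, Cr) that the skin model operates on."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

y, cb, cr = rgb_to_ycbcr(255, 128, 64)
```

Note that any neutral gray maps to Cb = Cr = 128, i.e. zero chrominance.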
The properties of skin color can be characterized by a Gaussian distribution. The single Gaussian model, one of the simplest models of the distribution of a class of objects, is widely used in computer vision and pattern recognition. The Gaussian distribution is given by

    p(x) = (1 / sqrt(2*pi*sigma^2)) * exp( -(x - mu)^2 / (2*sigma^2) )

where mu is the mean value of the samples and sigma^2 the variance. Using the Gaussian model to model skin color is a process of matching each pixel of the image to the model: if it matches, the pixel is considered a skin pixel; otherwise it is considered background.
Highlighting the hand posture in the film involves several steps, including skin detection, the CamShift algorithm, calibration and motion detection.
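The Gaussian matching step can be sketched as follows in Python; the skin statistics and threshold below are made-up illustrative numbers, not values from the paper.

```python
import math

def gaussian_pdf(x, mu, var):
    """1-D Gaussian density with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def is_skin(cb, cr, model, threshold):
    """Label a pixel as skin when both chrominance channels score above
    the threshold under a (hypothetical) single-Gaussian skin model."""
    p_cb = gaussian_pdf(cb, model["cb_mean"], model["cb_var"])
    p_cr = gaussian_pdf(cr, model["cr_mean"], model["cr_var"])
    return p_cb > threshold and p_cr > threshold

# Illustrative skin statistics (not measured values).
model = {"cb_mean": 110.0, "cb_var": 100.0, "cr_mean": 150.0, "cr_var": 100.0}
print(is_skin(112, 148, model, threshold=0.01))  # True: close to the mean
print(is_skin(60, 200, model, threshold=0.01))   # False: far from the mean
```

A full implementation would fit the mean and (co)variance from calibrated skin samples rather than hard-coding them.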
5. BLOCK DIAGRAM
Figure 5: Proposed Hand Gesture Recognition System
Gesture recognition is important for developing an attractive alternative to prevalent human-computer interaction modalities. The system design shown above depicts the techniques used for designing a dynamic user interface, which starts by acquiring images or video of the gestures from the user. The hand gestures considered are sequences of distinct hand gestures; a given hand gesture can undergo motion and discrete change, and the gestures are distinguished based on the nature of the motion. A real-time recognition engine is being developed which can reliably recognize these gestures despite individual variations, and which can also detect the start and end of gesture sequences in an automated fashion.
6. ALGORITHMS

6.1 Algorithm for calibration
In the calibration part, a snapshot of the hand region is taken from the webcam, and the user is prompted to select the hand region from the snapshot. The input image is converted from RGB color format to LAB and HSV format for further operations on the selected skin region. The mean values of A, B, H and S are then calculated and stored [7].
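The "compute and store the channel means" part of calibration can be illustrated in Python with the standard colorsys module; only the HSV means are shown here (the paper also stores mean L*a*b* a/b values), and the sample pixels are hypothetical.

```python
import colorsys

def calibrate(skin_pixels):
    """Average the hue and saturation of user-selected skin pixels
    (8-bit RGB tuples); these means act as the color markers used
    later when matching pixels against the skin model."""
    hs = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)[:2]
          for r, g, b in skin_pixels]
    hm = sum(h for h, _ in hs) / len(hs)
    sm = sum(s for _, s in hs) / len(hs)
    return hm, sm

# Hypothetical skin-tone samples from the selected region.
samples = [(210, 160, 140), (200, 150, 130), (220, 170, 150)]
hm, sm = calibrate(samples)
```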
6.2 Skin Detection
Skin detection is a very important process in detecting the hand posture. Before skin detection we perform calibration to obtain hand color pixels as samples, and then compare those pixels with the current image to detect the hand. The following processing steps can be observed in skin detection.
a. Load the color samples, and update the background by taking the median. Calculate the mean 'a*' and 'b*' value for each area extracted with roipoly; these values serve as the color markers in 'a*b*' space.
b. Trigger the image and take the difference between the current image and the background; the output is taken in three dimensions.
c. The next step is thresholding, which simply converts the image to binary, followed by morphological operations such as dilation (a kind of expansion) and filling of the circular region.
d. Convert the output to LAB format and HSV format, where LAB stands for L for luminosity, A for chromaticity layer a, and B for chromaticity layer b.
e. Calculate the Euclidean distance between the input pixels and the sample values, i.e. the inputs a, b, H, S and the means am, bm, hm, sm stored during calibration:
distance1 = ((a - am).^2 + (b - bm).^2 ).^0.5;
distance2 = ((H - hm).^2 + (S - sm).^2 ).^0.5;
If the distance is less than the threshold value 15 for LAB format (D1 < THRESHOLD (15)), image mask 1 is generated.
f. If the distance is less than the threshold value 0.5 for HSV format (D2 < THRESHOLD (0.5)), image mask 2 is generated.
The distance between pixels is small when the pixels are similar and large when they are different; in terms of HSV, a distance below 0.1 indicates a match. Get BW and BW2, the masks for skin segmentation, and perform an OR operation on the two color-space segmentation outputs to get the skin mask [8].
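The final mask-combination step can be sketched in Python as follows; the distance maps below are toy values, and the thresholds mirror the LAB (15) and HSV (0.5) values used above.

```python
def skin_mask(lab_dist, hsv_dist, lab_thresh=15.0, hsv_thresh=0.5):
    """Combine the two per-pixel distance maps: a pixel is skin if it is
    close to the calibrated sample in EITHER color space (logical OR of
    the two masks)."""
    mask1 = [[d < lab_thresh for d in row] for row in lab_dist]
    mask2 = [[d < hsv_thresh for d in row] for row in hsv_dist]
    return [[m1 or m2 for m1, m2 in zip(r1, r2)]
            for r1, r2 in zip(mask1, mask2)]

lab = [[10.0, 40.0], [12.0, 90.0]]   # toy distance1 values
hsv = [[0.9, 0.2], [0.7, 0.8]]       # toy distance2 values
print(skin_mask(lab, hsv))  # [[True, True], [True, False]]
```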
7. RESULTS AND DISCUSSION
Figure 6: Acquired Image with subplot
Figure 7: A sample output after background registration and foreground segmentation
Figure 8: Selecting ROI to get skin color samples in calibration
Figure 9: Dynamic single-palm picture with segmentation
Figure 10: Two palms after skin segmentation
Figure 11: Detection of palm and face after skin segmentation
8. INTERPRETATION OF RESULTS

Figure 6 shows the hand image captured through the camera using the Matlab program. Figure 7 shows a sample output after background registration and foreground segmentation. Figure 8 shows the hand image with an area selected; the values from the selected area are stored and used for dynamic hand segmentation based on the color values calculated earlier. Figures 9 and 10 show the dynamic segmentation obtained by matching the color values; pictures are saved while the hand is in motion. Figure 11 shows that the face is also detected, as a consequence of skin color segmentation.
9. CONCLUSION

This paper describes the design and implementation of a bare-hand dynamic gesture recognition system using just one color camera. The developed system can obtain a high recognition rate for bare-hand gestures. Our current research mainly focuses on single-hand static gestures for simplicity; we are still far from building a general-purpose gesture recognition system. Extensive experiments and evaluation in outdoor environments, where uncertainties such as changing backgrounds, sunshine and shadows complicate hand gestures, remain to be done. Dynamic gestures and two-handed gestures should also be explored in the future, since they are more expressive and allow more natural interaction. The system implementation shown above detects the hand, highlighting the hand posture in the input image using calibration and motion detection, and produces the segmented hand posture shown in the results.
As with other dynamic recognition techniques, the video processing needs speed and high throughput to provide a real-time, immersive experience.
REFERENCES
[1] Yoo-Joo Choi, Je-Sung Lee and We-Duke Cho, "A Robust Hand Recognition In Varying Illumination," Advances in Human Computer Interaction, Shane Pinder (Ed.), 2006.
[2] N. Conci, P. Cerseato and F. G. B. De Natale, "Natural Human-Machine Interface using an Interactive Virtual Blackboard," In Proceedings of ICIP 2007, pp. 181-184, 2007.
[3] S. Mitra and T. Acharya, "Gesture Recognition: A Survey," IEEE Transactions on Systems, Man and Cybernetics (SMC) - Part C: Applications and Reviews, vol. 37(3), pp. 211-324, 2007.
[4] Xiujuan Chai, Yikai Fang and Kongqiao Wang, "Robust hand gesture analysis and application in gallery browsing," In Proceedings of ICME, New York, pp. 938-94, 2009.
[5] Ayman Atia and Jiro Tanaka, "Interaction with Tilting Gestures in Ubiquitous Environments," International Journal of UbiComp (IJU), Vol. 1, No. 3, 2010.
[6] José Miguel Salles Dias, Pedro Nande, Pedro Santos, Nuno Barata and André Correia, "Image Manipulation through Gestures," In Proceedings of AICG'04, pp. 1-8, 2004.
[7] Pradeep Kumar B.P., Shailesh M.L., Shankar B.B. and Kumudeesh K.C., "Dynamic Hand Gesture Recognition," International Journal of Graphics and Image Processing (IJGIP), Volume 2, Issue 1, Feb 2012.
[8] Pradeep Kumar B.P., "Design and Development of a Human Computer Interface (HCI) System Based on Gesture Recognition Using SVM," International Journal of Graphics and Image Processing (IJGIP), Volume 2, Issue 2, May 2012.
[9] C. Carson, S. Belongie, H. Greenspan and J. Malik, "Blobworld: image segmentation using expectation-maximization and its application to image querying," IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(8):1026-1038, 2002.
[10] Y. Deng and B. S. Manjunath, "Unsupervised segmentation of color-texture regions in images and video," IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(8):800-810, 2001.
[11] Haihua Feng, D. A. Castanon and W. C. Karl, "A curve evolution approach for image segmentation using adaptive flows," In Proceedings of the Eighth IEEE International Conference on Computer Vision (ICCV 2001), volume 2, 2001.
[12] W. Y. Ma and B. S. Manjunath, "Edge flow: a framework of boundary detection and image segmentation," In Proceedings of the IEEE International Conference on Computer Vision, 1997.
[13] V. Mezaris, I. Kompatsiaris and M. Strintzis, "An ontology approach to object-based image retrieval," 2003.
[14] Christopher Town and David Sinclair, "Content based image retrieval using semantic visual categories," Technical report, 2001.
[15] J. Wang, J. Li, D. Chan and G. Wiederhold, "Semantics-sensitive retrieval for digital picture libraries," D-Lib Magazine, 5(11), November 1999.
[16] Shaikh Shabnam Shafi Ahmed, Dr. Shah Aqueel Ahmed and Sayyad Farook Bashir, "Fast Algorithm for Video Quality Enhancing using Vision-Based Hand Gesture Recognition," International Journal of Computer Engineering & Technology (IJCET), Volume 3, Issue 3, 2012, pp. 501-509, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
ABOUT THE AUTHORS

Mr. Shivamurthy R C received the BE degree from PDA College of Engineering, Gulbarga University, and the M.Tech degree in Computer Science & Engineering from Malnad College of Engineering, Visvesvaraya Technological University, Belgaum. He is currently working as a professor in the Department of Computer Science at A.I.T, Tumkur, Karnataka, and is also a Ph.D scholar at CMJ University, India.

Dr. B.P. Mallikarjunaswamy is working as a professor in the Department of Computer Science & Engineering, Sri Siddhartha Institute of Technology, affiliated to Sri Siddhartha University. He has more than 20 years of experience in teaching and 5 years in R & D. He is guiding many Ph.D scholars and has published more than 30 technical papers in national and international journals and conferences. His current research interests are in pattern recognition and image processing.

Mr. Pradeep Kumar B.P. is presently working as an Assistant Professor at Akshaya Institute of Technology, Tumkur, and pursuing his PhD at Jain University in the Electronics & Communication Engineering department. He has published more than 20 technical papers in national and international journals and conferences. His areas of interest are signal processing, medical imaging, pattern recognition, video processing and multimedia communication systems.