CERTIFICATE
This is to certify that the thesis entitled "FACE ANTI SPOOFING USING SPEEDED UP ROBUST FEATURES AND FISHER VECTOR ENCODING" is being submitted by P.PRASANTHI, R.JAGATHI, N.GAYATHRI SHARMILA DEVI, K.MOUNIKA, P.PRAVALIKA in partial fulfilment of the requirement for the award of the degree of BACHELOR OF TECHNOLOGY in the ELECTRONICS AND COMMUNICATION ENGINEERING branch to KAKINADA INSTITUTE OF ENGINEERING AND TECHNOLOGY FOR WOMEN, affiliated to Jawaharlal Nehru Technological University, Kakinada, and is a record of bonafide work carried out by them under my guidance and supervision.
The results embodied in this thesis have not been submitted to any other university or institute for the award of any degree or diploma.
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
It gives us immense pleasure to acknowledge all those who helped us throughout in making this project a great success.
With profound gratitude we thank Mr. Y. RAMA KRISHNA, M.Tech, MBA, Principal, Kakinada Institute of Engineering and Technology for Women, for his timely suggestions, which helped us to complete this project work successfully.
Our sincere thanks and deep sense of gratitude to MS. P. LATHA, M.Tech, Head of the Department of ECE, for her valuable guidance in the completion of this project.
We are thankful to both the teaching and non-teaching staff members of the ECE department for their kind cooperation and all sorts of help in bringing out this project work successfully.
P.PRASANTHI - 17JN1A0481
R.JAGATHI - 17JN1A0454
N.GAYATHRI SHARMILA DEVI - 17JN1A0458
K.MOUNIKA - 17JN1A0476
P.PRAVALIKA - 18JN5A0413
DECLARATION
We hereby declare that the project work "FACE ANTI SPOOFING USING SPEEDED UP ROBUST FEATURES AND FISHER VECTOR ENCODING" submitted to JNTU Kakinada is a record of original work done by us under the guidance of MS. P. LATHA, M.Tech, Asst. Professor, Electronics & Communication Engineering. This project work is submitted in partial fulfilment of the requirement for the award of the degree of Bachelor of Technology in Electronics & Communication Engineering. The results embodied in this project report have not been submitted to any other university or institute for the award of any degree or diploma.
P.PRASANTHI - 17JN1A0481
R.JAGATHI - 17JN1A0454
N.GAYATHRI SHARMILA DEVI - 17JN1A0458
K.MOUNIKA - 17JN1A0476
P.PRAVALIKA - 18JN5A0413
ABSTRACT
CONTENTS
List of Figures I
List of Acronyms II
Abstract III
1.1 INTRODUCTION 1
2. LITERATURE SURVEY 9-14
3.5 DRAWBACKS 22
5.1 IMAGE 45
6. RESULTS 55-57
ADVANTAGES & APPLICATIONS 58
REFERENCES 64-67
LIST OF FIGURES
4.1 Proposed Architecture 28
5.4 GUI 61
LIST OF ACRONYMS
FV Fisher Vector
FL Fuzzy Logic
CHAPTER-1
INTRODUCTION
1.1 INTRODUCTION:
There is an explosion of social media content available online on services such as Flickr, YouTube and Zoom. Such media repositories encourage users to collaboratively create, evaluate and distribute media content. They also allow users to annotate their uploaded media with descriptive keywords called tags. As an example, Fig. 1 illustrates a social image and its associated user-provided tags. These valuable metadata can greatly facilitate the organization and search of social media. By indexing images with their associated tags, images can be easily retrieved for a given query. However, since user-provided tags are usually noisy and incomplete, simply applying a text-based retrieval approach may lead to unsatisfactory results. Therefore, a ranking approach that is able to exploit both the tags and the images' content is desired, to provide users with better social image search results.
Currently, Flickr provides two ranking options for tag-based image search. One is "most recent", which orders images based on their uploading time; the other is "most interesting", which ranks images by "interestingness", a measure that integrates information such as click-through and comments. In the following discussion, we name these two methods time-based ranking and interestingness-based ranking, respectively. They both rank images according to measures (interestingness or time) that are not related to relevance, and this results in many irrelevant images among the top search results. As an example, the figure illustrates the top results of the query "waterfall" with the two ranking options, in which we can see that many images are irrelevant to the query, such as those marked with red boxes. In addition to relevance, lack of diversity is also a problem. Many images from social media websites are actually close to each other. For example, many users tend to upload continuously captured images in batch, and many of them are visually and semantically close. When these images appear simultaneously as top results, users get only limited information. We can also observe this fact: the images marked with blue or green boxes are very close to at least one of the other images. Therefore, a ranking scheme that can generate relevant and diverse results is highly desired. This problem is closely related to a key scientific challenge recently released by Yahoo.
The necessity of diversity may seem less intuitive than relevance, but its importance has also long been acknowledged in information retrieval [9, 8]. One explanation is that the relevance of a document (which can be a web page, image or video) with respect to the query should depend not only on the document itself but also on its difference from the documents appearing before it. We can also observe this issue from another perspective: in many cases users cannot accurately and exhaustively describe their requests, and thus keeping the search results diverse gives users more chances to find the desired content quickly. With the development of social media based on Web 2.0, large amounts of images and videos spring up everywhere on the Internet. This phenomenon has brought great challenges to multimedia storage, indexing and retrieval. Generally speaking, tag-based image search is more commonly used in social media than content-based image retrieval and content understanding. Owing to the low relevance and diversity of the initial retrieval results, the ranking problem in tag-based image retrieval has gained wide attention from researchers.
Ranking approaches [3, 7] have been dedicated to overcoming these problems. As for the "query ambiguity" problem, an effective approach is to provide diverse retrieval results that cover the multiple topics underlying a query. Currently, image clustering [10] and duplicate removal [5-6] are the major approaches to the diversity problem. However, most of the literature regards the diversity problem as one of promoting visual diversity, while the promotion of semantic coverage is often ignored. To diversify the top-ranked search results from the semantic aspect, the topic community to which each image belongs should be considered. In recent years, more and more scholars have paid attention to the diversity of retrieval results [8]. In [5], the authors first apply graph clustering to assign the images to clusters, and then utilize a random walk to obtain the final result. Diversity is achieved by setting the transition probability between two images in different clusters higher than that between images in the same cluster. Tina et al. consider the topic structure in the initial list to be hierarchical [6]. They first organize images into different leaf topics, then define a topic cover score based on the topic list, and finally use a greedy algorithm to obtain the list with the highest topic cover score. Dang-Nguyen et al. [7] first propose a clustering algorithm to obtain a topic tree, and then sort topics according to the number of images in each topic. In each cluster, the image uploaded by the user with the highest visual score is selected as the top-ranked image. The second image is the one with the largest distance to the first image. The third image is chosen as the image with the largest distance to both previous images, and so on. In our previous work [8], diversity is achieved based on social user re-ranking: we regard the images uploaded by the same user as a cluster and pick one image from each cluster to achieve diversity. Most papers consider diversity from a visual perspective and achieve it by applying clustering on visual features [9]. In this paper, we focus on topic diversity. We first group all the tags in the initial retrieval list so that tags with similar semantics fall in the same cluster, and then assign images to the different clusters. The images within the same cluster are viewed as having similar semantics. After ranking the clusters and the images in each cluster, we select one image from each cluster to achieve semantic diversity.
Many commercial image search engines on the Internet use only keywords as queries. Users type query keywords in the hope of finding a certain type of images. The search engine returns thousands of images ranked by the keywords extracted from the surrounding text. It is well known that text-based image search suffers a lot from the ambiguity of query keywords. The keywords provided by users tend to be short: for example, the average query length of the top 2,000 queries of Picsearch is 1.369 words, and 95% of them contain only one or three words [1]. Such keywords cannot describe the content of images accurately and completely. The search results are noisy and ambiguous, consisting of images with quite different semantic meanings. Fig. 1 shows the top-ranked images from Bing image search using "Jaguar" as the query. They belong to different categories, such as "blue Jaguar car", "black Jaguar car", "Jaguar logo", and "jaguar animal", due to the ambiguity of the word "Jaguar". The ambiguity issue arises for several reasons. First, the meanings of the query keywords may be richer than the user expects; for example, the meaning of the word "Jaguar" includes the jaguar animal, the Jaguar car and the Jaguar logo. Second, the user may not have enough knowledge of the textual description of the target image he/she is searching for. Most importantly, in many scenarios it is difficult for users to describe the visual content of the queried images accurately using keywords. In order to resolve the ambiguity, additional information has to be used. One way is text-based keyword expansion, which makes the textual description of the query more detailed. Existing linguistically related methods find either synonyms and other linguistically related words from a thesaurus, or words that frequently co-occur with the query keywords. However, the interaction between the user and the system has to be as simple as possible; the minimum requirement is a single click. In this paper, we propose a novel Internet image search approach. It requires the user to give only one click on a query image, and images from a pool retrieved by text-based search are re-ranked based on their visual and textual similarities to the query image. Users will tolerate one-click interaction, which has been used by many famous text-based search engines; for example, Google requires a user to select a suggested textual query expansion by one click to get additional results. The problem solved in this paper is how to capture the user's intention from this one-click query image.
Web-image search has become a key feature of well-known
search engines such as 'Google', 'Yahoo', 'Bing', etc. Given a text query, the search engine has to go through millions of images to retrieve, as quickly as possible, the relevant ones. Most of these search engines are primarily based on the use of text metadata such as keywords, tags, and/or text descriptions near the images. Since the metadata do not always correspond to the visual content of the images, the retrievals are usually mixed up with undesirable non-relevant images. However, it has been observed that the so-retrieved images contain enough relevant images (they are made for users that are in general more interested in precision than in recall) and that the precision can be improved by re-ranking the initial set of retrieved images. This re-ranking stage can benefit from the use of the visual information contained in the images, as shown by [15]. Web-image re-ranking can be seen as a binary classification problem where the relevant images belong to the positive class. Although true labels are not provided, it is still possible to build class models based on the two following assumptions: (i) the initial text-based search provides a reasonable initial ranking, which is to say that a majority of the top-ranked images are relevant to the query, meaning that classifiers such as SVMs can be trained by using the top-ranked images as (noisy) positive images while the images that are ranked below, or even images from other datasets, are treated as negative images (see e.g. [4]); (ii) the relevant images are visually similar to each other (at least within groups) while the non-relevant images tend not to be similar to any other images. Graph-based re-ranking approaches exploit this second assumption by modeling the connectivity among retrieved images [18]. Recent research has demonstrated that sparse coding (or sparse representation) is a powerful image representation model. The idea is to represent an input signal as a linear combination of a few items from an over-complete dictionary. Unlike approaches to classification that perform low-rank recovery class by class during training, our method processes all training data simultaneously. Compared to other dictionary learning methods [12] that are very sensitive to noise in training images, our dictionary learning algorithm is robust: contaminated images can be recovered during our dictionary learning process.
CHAPTER-2
LITERATURE SURVEY
Many Internet-scale image search methods [11]–[17] are text-based and are limited by the fact that query keywords cannot describe image content accurately. Content-based image retrieval uses visual features to evaluate image similarity. Many visual features [5]–[9] were developed for image search in recent years. Some were global features, such as GIST [5] and HOG [6]. Some quantized local features, such as SIFT [13], into visual words, and represented images as bags-of-visual-words (BoW) [8]. In order to preserve the geometry of the visual words, spatial information was encoded into the BoW model in multiple ways. For example, Zhang et al. [9] proposed geometry-preserving visual phrases, which captured the local and long-range spatial layouts of visual words. One of the major challenges of content-based image retrieval is to learn visual similarities that reflect the semantic relevance of images. Image similarities can be learned from a large training set where the relevance of pairs of images is known [10]. Deng et al. [11] learned visual similarities from a hierarchical structure defined on semantic attributes of training images. Since web images are highly diversified, defining a set of attributes with hierarchical relationships for them is challenging. In general, learning a universal visual similarity metric for generic images is still an open problem. Some visual features may be more effective for certain query images than others. In order to make the visual similarity metrics more specific to the query, relevance feedback [12]–[16] was widely used to expand the set of visual examples. The user was asked to select multiple relevant and irrelevant image examples from the image pool, and a query-specific similarity metric was learned from the selected examples. For example, in [12]–[14], [16], [17], discriminative models were learned from the examples labeled by users using support vector machines or boosting, and classified the relevant and irrelevant images. In [21] the weights for combining different types of features were adjusted according to users' feedback. Since the number of user-labeled images is small for supervised learning methods, Huang et al. [15] proposed probabilistic hypergraph ranking under the semi-supervised learning framework. It utilized both labeled and unlabeled images in the learning procedure.
Relevance feedback requires extra effort from users. For a web-scale commercial system, users' feedback has to be kept to a minimum, such as one-click feedback. In order to reduce users' burden, pseudo relevance feedback [18], [19] expanded the query image by taking the top N images visually most similar to the query image as positive examples. However, due to the well-known semantic gap, the top N images may not all be semantically consistent with the query image, which may reduce the performance of pseudo relevance feedback. Chum et al. [8] used RANSAC to verify the spatial configurations of local visual features and to purify the expanded image examples. However, it was only applicable to object retrieval: it required users to draw the image region of the object to be retrieved and assumed that relevant images contained the same object. Under the framework of pseudo relevance feedback, Ah-Pine et al. proposed trans-media similarities that combined both textual and visual features. Another line of work proposed query-relative classifiers, which combined visual and textual information, to re-rank images retrieved by an initial text-only search. However, since users were not required to select query images, the users' intention could not be accurately captured when the semantic meanings of the query keywords had large diversity. We conducted the first study that combines text and image content for image search directly on the Internet, where simple visual features and clustering algorithms were used to demonstrate the great potential of such an approach. Following our intent image search work in [1] and [2], a visual query suggestion method was developed. Its difference from [1] and [2] is that instead of asking the user to click on a query image for re-ranking, the system asks users to click on a list of keyword-image pairs generated offline using a dataset from Flickr, and searches images on the web based on the selected keyword. The problem with this approach is that, on one hand, the dataset from Flickr is too small compared with the entire Internet and thus cannot cover the unlimited variety of Internet images; on the other hand, the keyword-image suggestions for any input query are generated from the millions of images of the whole dataset, and are thus expensive to compute and may produce a large number of unrelated keyword-image pairs. Besides visual query expansion, some approaches used concept-based query expansion by mapping textual query keywords or visual query examples to high-level semantic concepts. They needed a pre-defined concept lexicon whose detectors were learned offline from fixed training sets. These approaches were suitable for closed databases but not for web-based image search, since the limited number of concepts cannot cover the numerous images on the Internet.
The idea of learning an example-specific visual similarity metric was explored in previous work. However, such methods required training a specific visual similarity for every example in the image pool, which is assumed to be fixed. This is impractical in our application, where the image pool returned by text-based search constantly changes for different query keywords. Moreover, text information, which can significantly improve visual similarity learning, was not considered in previous work. Searching using a combination of more than one image feature, for example region and color, improves retrieval effectiveness. Using a single-region query example is better than using the whole image as the query example. However, multiple-region query examples outperformed both the single-region query examples and the whole-image example queries [8]. The Gabor filter has been widely used to extract image features, especially texture features. It is optimal in terms of minimizing the joint uncertainty in space and frequency, and is often used as an orientation- and scale-tunable edge and line (bar) detector. There have been many
approaches proposed to characterize textures of images based on Gabor filters. In Gabor
methods, a particular set of Gabor filters (corresponding to different angles) is chosen,
which determines the quality of the result in applications such as CBIR. To get rid of the angle dependence, some types of permutations on the feature matrices are applied. In the traditional application of Gabor filters, the chosen directions may not correspond to the orientation of the contents of the query image. Therefore any method that extracts features independent of orientation in the image is desirable. Thus rotation invariance is particularly useful when one wants to retrieve images having the same content but in different orientations. The Gabor function is modified suitably in such a way that the resulting function, besides inheriting the good properties of Gabor filters, is a Radial Basis Function (RBF), which is an angle-independent function. Hence no specific set of angles is required for feature extraction. The main features of the present algorithm are: (a) it uses images in the Cartesian domain, avoiding the nonlinear polar transformation and certain approximations resulting therefrom; (b) it does not require, unlike the standard Gabor method, direction-dependent filters for the extraction of information pertaining to different directions, which minimizes the amount of computation. Additionally, our feature extraction procedure is independent of the presence of rotation in images, and hence is useful for rotation-independent CBIR [9]. One can assume that the goal of
content-based image retrieval is to find images which are both semantically and visually relevant. To evaluate CBIR systems, a subject is presented with query and corresponding result image pairs. The subject evaluates each pair as either "undecided", "poor match", "faint match", or "good match", thus evaluating the "query by image example" paradigm [10]. Several methods for retrieving images on the basis of color similarity have been described in the literature, but most are variations on the same basic idea. Each image added to the collection is analyzed to compute a color histogram, which shows the proportion of pixels of each color within the image. The color histogram for each image is then stored in the database.
each image is then stored in the database. The approach more frequently adopted for
CBIR systems is based on the conventional color histogram (CCH), which contains
occurrences of each color obtained
The approach more frequently adopted for CBIR systems is based on the conventional color histogram (CCH), which contains the occurrences of each color, obtained by counting all image pixels having that color. Each pixel is associated with a specific histogram bin only on the basis of its own color; color similarity across different bins and color dissimilarity within the same bin are not taken into account. Since any pixel in the image can be described by three components in a certain color space (for instance, red, green and blue components in RGB space, or hue, saturation and value in HSV space), a histogram, i.e. the distribution of the number of pixels for each quantized bin, can be defined for each component. Clearly, the more bins a color histogram contains, the more discrimination power it has. However, a histogram with a large number of bins will not only increase the computational cost, but will also be inappropriate for building efficient indexes for an image database.
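As a concrete illustration of such per-component histograms, the following MATLAB sketch computes a quantized HSV histogram of the kind described above; the file name and the bin counts are illustrative assumptions, not values prescribed by the cited methods.

    % Quantized per-component HSV colour histogram (illustrative bin counts).
    rgb = imread('example.jpg');          % placeholder input image
    hsv = rgb2hsv(rgb);
    h = histcounts(hsv(:,:,1), linspace(0, 1, 17));   % 16 hue bins
    s = histcounts(hsv(:,:,2), linspace(0, 1, 5));    % 4 saturation bins
    v = histcounts(hsv(:,:,3), linspace(0, 1, 5));    % 4 value bins
    feat = [h, s, v] / numel(hsv(:,:,1)); % normalized 24-D colour feature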
Content-based image retrieval systems return images to users based on image descriptors. These descriptors are often provided by an example image—the query-by-example paradigm. The CBIR system used in this work is an application of a system developed for modeling the joint probability of image region features and associated text. It is not necessary to train the model on both text and image data; two variants of the model are used, one where both text and image data are used, and one where only image data is used.
The conventional color histogram with quadratic-form (QF) distance as the similarity measure and the fuzzy color histogram with Euclidean distance are almost similar in performance, but they do not respond well to shifted or translated images. In order to overcome this problem, an invariant color histogram technique is used, which makes use of gradients in the different channels to weight the influence of a pixel on the histogram so as to cancel out the changes induced by deformations. When a rotated image is given as the query, the original image is retrieved as the closest match [11]. Color and Local Spatial Feature Histograms
(CLSFH) have fewer feature indexes and can capture more color-spatial information in an image. At the same time, as the four histograms used by CLSFH are calculated globally on the image, the two local spatial statistical moment histograms and the color histogram are insensitive to image rotation, translation and scaling, while the local directional difference unit histogram is insensitive to image translation and scaling. In CLSFH, a non-uniformly quantized HSV color model is used; the mean and the standard deviation of the 5×5 neighborhood of every pixel are calculated and used to generate the Local Mean Histogram and the Local Standard Deviation Histogram; the Directional Difference Unit of the 3×3 neighborhood of every pixel is defined and computed, and is used to generate the Local Directional Difference Unit Histogram. These three histograms and the color histogram are used as feature indexes to retrieve color images. CLSFH is thus effective for images, especially for images with relatively regular texture and structure characteristics [12].
CHAPTER-3
IMPLEMENTATION TECHNOLOGY
3.1 SURF WITH FISHER VECTOR
Local binary patterns (LBP) is a type of feature used for classification in computer vision. LBP is a particular case of the Texture Spectrum model proposed in 1990. LBP was first described in 1994. It has since been found to be a powerful feature for texture classification; it has further been determined that when LBP is combined with the Histogram of Oriented Gradients (HOG) descriptor, it improves the detection performance considerably on some datasets.
LBP is a simple yet very efficient texture operator which labels the pixels of an image by thresholding the neighborhood of each pixel with the value of the center pixel and considers the result as a binary number. Due to its discriminative power and computational simplicity, the LBP texture operator has become a popular approach in various applications. It can be seen as a unifying approach to the traditionally divergent statistical and structural models of texture analysis.
Perhaps the most important property of the LBP operator in real-world applications is its robustness to monotonic gray-scale changes caused, for example, by illumination variations. Another important property is its computational simplicity, which makes it possible to analyze images in challenging real-time settings.
The local binary pattern (LBP) texture analysis operator is defined as a gray-scale
invariant texture measure, derived from a general definition of texture in a local
neighborhood. The LBP operator can be seen as a unifying approach to the traditionally
divergent statistical and structural models of texture analysis.
The basic idea for developing the LBP operator was that two-dimensional surface textures can be described by two complementary measures: local spatial patterns and gray-scale contrast. The original LBP operator (Ojala et al. 1996) forms labels for the image pixels by thresholding the 3×3 neighborhood of each pixel with the center value and considering the result as a binary number. The histogram of these 2^8 = 256 different labels can then be used as a texture descriptor. This operator, used jointly with a simple local contrast measure, provided very good performance in unsupervised texture segmentation (Ojala and Pietikäinen 1999). After this, many related approaches have been developed for texture and color texture segmentation.
The LBP operator was extended to use neighborhoods of different sizes (Ojala et al. 2002). Using a circular neighborhood and bilinearly interpolating values at non-integer pixel coordinates allows any radius and any number of pixels in the neighborhood. The gray-scale variance of the local neighborhood can be used as a complementary contrast measure.
In the following, the notation (P, R) will be used for pixel neighborhoods, meaning P sampling points on a circle of radius R. See Fig. 2 for an example of the LBP computation.
Another extension to the original operator is the definition of so-called uniform patterns, which can be used to reduce the length of the feature vector and to implement a simple rotation-invariant descriptor. This extension was inspired by the fact that some binary patterns occur more commonly in texture images than others. A local binary pattern is called uniform if the binary pattern contains at most two bitwise transitions from 0 to 1 or vice versa when the bit pattern is traversed circularly. For example, the patterns 00000000 (0 transitions), 01110000 (2 transitions) and 11001111 (2 transitions) are uniform, whereas the patterns 11001001 (4 transitions) and 01010010 (6 transitions) are not. In the computation of the LBP labels, uniform patterns are used so that there is a separate label for each uniform pattern and all the non-uniform patterns are labeled with a single label. For example, when using the (8, R) neighborhood, there are a total of 256 patterns, 58 of which are uniform, which yields 59 different labels.
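As a small worked example of the uniformity test just described, the following MATLAB sketch counts the circular 0/1 transitions of an 8-bit pattern (the sample pattern is one of those listed above):

    % Count circular bitwise transitions of an 8-bit LBP code.
    p = uint8(bin2dec('01110000'));          % sample pattern from the text
    bits = bitget(p, 8:-1:1);                % MSB-first bit vector
    transitions = sum(bits ~= circshift(bits, -1));  % compare with circular shift
    isUniform = transitions <= 2;            % uniform: at most 2 transitions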
Ojala et al. (2002) noticed in their experiments with texture images that uniform patterns account for a little less than 90% of all patterns when using the (8, 1) neighborhood, and for around 70% in the (16, 2) neighborhood. Each bin (LBP code) can be regarded as a micro-texton. Local primitives which are codified by these bins include different types of curved edges, spots, flat areas, etc.
The following notation is used for the LBP operator: LBP_{P,R}^{u2}. The subscript represents using the operator in a (P, R) neighborhood; the superscript u2 stands for using only uniform patterns and labelling all remaining patterns with a single label. After the LBP-labelled image f_l(x, y) has been obtained, the LBP histogram can be defined as

H_i = ∑_{x,y} I{ f_l(x, y) = i },  i = 0, …, n − 1,  (1)

in which n is the number of different labels produced by the LBP operator, and I{A} is 1 if A is true and 0 if A is false.
When the image patches whose histograms are to be compared have different sizes, the histograms must be normalized to get a coherent description:

N_i = H_i / ∑_{j=0}^{n−1} H_j.  (2)
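To make Eqs. (1) and (2) concrete, here is a minimal MATLAB sketch of the basic 3×3 LBP labelling and histogram; it assumes a grayscale image is already loaded as a matrix (the file name is a placeholder) and does not implement the uniform-pattern relabelling:

    % Basic 3x3 LBP labels and normalized histogram, Eqs. (1)-(2).
    I = double(imread('face.png'));              % placeholder grayscale image
    [r, c] = size(I);
    nb = [-1 -1; -1 0; -1 1; 0 1; 1 1; 1 0; 1 -1; 0 -1];   % 8 neighbours
    labels = zeros(r - 2, c - 2);
    for k = 1:8
        sh = I(2+nb(k,1) : r-1+nb(k,1), 2+nb(k,2) : c-1+nb(k,2));
        labels = labels + (sh >= I(2:r-1, 2:c-1)) * 2^(k-1); % threshold, weight
    end
    H = histcounts(labels(:), 0:256);            % Eq. (1): label histogram
    N = H / sum(H);                              % Eq. (2): normalization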
In the LBP approach to texture classification, the occurrences of the LBP codes in an image are collected into a histogram. The classification is then performed by computing simple histogram similarities. However, applying a similar approach directly to facial image representation results in a loss of spatial information, so one should codify the texture information while also retaining its location. One way to achieve this goal is to use the LBP texture descriptors to build several local descriptions of the face and combine them into a global description. Such local descriptions have been gaining interest lately, which is understandable given the limitations of holistic representations. These local feature based methods are more robust against variations in pose or illumination than holistic methods.
The basic methodology for LBP-based face description proposed by Ahonen et al. (2006) is as follows: the facial image is divided into local regions and LBP texture descriptors are extracted from each region independently. The descriptors are then concatenated to form a global description of the face, as shown in Fig. 4.
This histogram effectively provides a description of the face on three different levels of locality: the LBP labels for the histogram contain information about the patterns at a pixel level, the labels are summed over a small region to produce information at a regional level, and the regional histograms are concatenated to build a global description of the face.
The two-dimensional face description method has been extended into the spatiotemporal domain (Zhao and Pietikäinen 2007). Fig. 1 depicts facial expression description using LBP-TOP. Excellent facial expression recognition performance has been obtained with this approach.
It should be noted that when using the histogram-based methods the regions do not need to be rectangular. Neither do they need to be of the same size or shape, nor do they necessarily have to cover the whole image. It is also possible to have partially overlapping regions.
Such countermeasures also do not generalize well if employed for iris or fingerprint spoofing, and vice versa. Likewise, the performance of face liveness detectors drops drastically when they are presented with novel fabrication materials (not used during the system design/training stage). High error rates remain a problem: none of the methods has yet been shown to reach very low acceptance errors.
CHAPTER-4
INTRODUCTION TO FACE ANTI SPOOFING
In electronics and signal processing, a Gaussian filter is a filter whose impulse response is a Gaussian function (or an approximation to it). Gaussian filters have the property of having no overshoot to a step function input while minimizing the rise and fall time. This behavior is closely connected to the fact that the Gaussian filter has the minimum possible group delay. It is considered the ideal time domain filter, just as the sinc filter is the ideal frequency domain filter [1]. These properties are important in areas such as oscilloscopes [2] and digital telecommunication systems [3]. Mathematically, a Gaussian filter modifies the input signal by convolution with a Gaussian function; this transformation is also known as the Weierstrass transform. In two dimensions, it is the product of two such Gaussians, one per direction:

g(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²)).
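A short MATLAB illustration of this separability, assuming the Image Processing Toolbox is available (the file name is a placeholder): smoothing with a 2-D Gaussian is equivalent to convolving with two 1-D Gaussians, one per direction.

    % 2-D Gaussian smoothing, directly and as two 1-D convolutions.
    I = double(imread('face.png'));          % placeholder grayscale image
    J = imgaussfilt(I, 2);                   % direct 2-D Gaussian, sigma = 2
    g = fspecial('gaussian', [1 9], 2);      % 1-D Gaussian kernel, sigma = 2
    K = conv2(g, g, I, 'same');              % column then row convolution
    % J and K closely agree, up to kernel truncation and boundary handling.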
The good performance of SIFT compared to other descriptors [8] is remarkable. Its mixing of crudely localized information and the distribution of gradient-related features seems to yield good distinctive power while fending off the effects of localization errors in terms of scale or space. Using the relative strengths and orientations of gradients reduces the effect of photometric changes. The proposed SURF descriptor is based on similar properties, with a complexity stripped down even further. The first step consists of fixing a reproducible orientation based on information from a circular region around the interest point. Then, we construct a square region aligned to the selected orientation, and extract the SURF descriptor from it. These two steps are explained in turn below. Furthermore, we also propose an upright version of our descriptor (U-SURF) that is not invariant to image rotation and is therefore faster to compute and better suited for applications where the camera remains more or less horizontal. Speeded-Up Robust Features (SURF) [20] is a fast and efficient scale- and rotation-invariant descriptor. Each sub-region contributes a vector

V = (∑dx, ∑dy, ∑|dx|, ∑|dy|),  (1)

where dx and dy are the Haar wavelet responses in the horizontal and vertical directions, respectively. The feature vectors extracted from each sub-region are concatenated to form a SURF descriptor with 64 dimensions:

SURF = [V1, ..., V16].  (2)
The SURF descriptor was originally proposed for grayscale images. Inspired by our previous findings [15], [2] showing the importance of color texture in face anti-spoofing, we propose to extract the SURF features from color images instead of the gray-scale representation. First, the SURF descriptor is applied on each color band separately. Then, the obtained features are concatenated to form a single feature vector (referred to as CSURF). Finally, Principal Component Analysis (PCA) [2] is applied to de-correlate the obtained feature vector and reduce the dimensionality of the face description.
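A minimal MATLAB sketch of this per-band extraction, using detectSURFFeatures and extractFeatures from the Computer Vision Toolbox and pca from the Statistics and Machine Learning Toolbox; the file name, the number of kept points and the number of retained components are illustrative assumptions:

    % SURF descriptors per colour band, stacked, then de-correlated with PCA.
    rgb = imread('face.png');                       % placeholder face image
    feats = [];
    for band = 1:3
        ch  = rgb(:,:,band);
        pts = detectSURFFeatures(ch);               % interest points in this band
        f   = extractFeatures(ch, pts.selectStrongest(100));  % 64-D descriptors
        feats = [feats; f];                         %#ok<AGROW> stack all bands
    end
    coeff   = pca(double(feats));                   % PCA loadings
    reduced = double(feats) * coeff(:, 1:32);       % keep first 32 components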
Feature Extraction
The approach for interest point detection uses a very basic Hessian matrix approximation.
Integral images
The Integral Image or Summed-Area Table was introduced in 1984. The Integral Image
isusedasaquickandeffectivewayofcalculatingthesumofvalues(pixelvalues)inagivenimage
—
orarectangularsubsetofagrid(thegivenimage).Itcanalso,orismainly,usedforcalculatingtheav
erageintensitywithinagivenimage.
They allow for fast computation of box type convolution filters. The entry of an integral
image I_∑ (x) at a location x = (x, y) ᵀ represents the sum of all pixels in the input image
I within a rectangular region formed by the origin and x.
With Σ calculated, it
Only takes four additions to calculate the sum of the intensities over any upright,
rectangular area, independent of its size.
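The following MATLAB sketch shows both halves of this idea: building the summed-area table with cumulative sums, and recovering a box sum with four table lookups (the rectangle coordinates are arbitrary examples; the Computer Vision Toolbox's integralImage function builds the same zero-padded table):

    % Summed-area table and constant-time box sum.
    I = magic(100);                          % stand-in image data
    S = zeros(size(I) + 1);                  % zero-padded table
    S(2:end, 2:end) = cumsum(cumsum(I, 1), 2);
    r1 = 10; c1 = 20; r2 = 40; c2 = 60;      % example rectangle corners
    boxSum = S(r2+1, c2+1) - S(r1, c2+1) - S(r2+1, c1) + S(r1, c1);
    assert(boxSum == sum(sum(I(r1:r2, c1:c2))));   % check against direct sum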
SURF uses the Hessian matrix because of its good performance in computation time and accuracy. Rather than using a different measure for selecting the location and the scale (as the Hessian-Laplace detector does), SURF relies on the determinant of the Hessian matrix for both. Given a pixel, the Hessian at this pixel is

H = [ ∂²I/∂x²  ∂²I/∂x∂y ; ∂²I/∂x∂y  ∂²I/∂y² ].

To adapt to any scale, the image is filtered by a Gaussian kernel, so given a point x = (x, y), the Hessian matrix H(x, σ) in x at scale σ is defined as

H(x, σ) = [ Lxx(x, σ)  Lxy(x, σ) ; Lxy(x, σ)  Lyy(x, σ) ],

where Lxx(x, σ) is the convolution of the Gaussian second-order derivative with the image I at point x, and similarly for Lxy(x, σ) and Lyy(x, σ). Gaussians are optimal for scale-space analysis, but in practice they have to be discretized and cropped. This leads to a loss in repeatability under image rotations around odd multiples of π/4. This weakness holds for Hessian-based detectors in general. Nevertheless, the detectors still perform well, and the slight decrease in performance does not outweigh the advantage of fast convolutions brought by the discretization and cropping.
In order to calculate the determinant of the Hessian matrix, we first need to apply convolution with a Gaussian kernel, then the second-order derivative. After Lowe's success with LoG approximations (SIFT), SURF pushes the approximation (of both the convolution and the second-order derivative) even further with box filters. These approximate second-order Gaussian derivatives and can be evaluated at a very low computational cost using integral images, independently of size, and this is part of the reason why SURF is fast.
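As a rough MATLAB illustration of the Hessian blob response (using true Gaussian second-derivative kernels rather than SURF's box-filter approximations, and a placeholder file name), one scale of the response map can be computed as:

    % Determinant-of-Hessian response at one scale (Gaussian derivatives,
    % not the box-filter approximation used by SURF).
    I = double(imread('face.png'));           % placeholder grayscale image
    g = fspecial('gaussian', 9, 1.2);         % sigma = 1.2, smallest SURF scale
    [gx, gy]   = gradient(g);                 % first-order derivative kernels
    [gxx, gxy] = gradient(gx);                % second-order derivative kernels
    [~, gyy]   = gradient(gy);
    Lxx = conv2(I, gxx, 'same');
    Lxy = conv2(I, gxy, 'same');
    Lyy = conv2(I, gyy, 'same');
    detH = Lxx .* Lyy - Lxy.^2;               % interest measure per pixel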
Due to the use of box filters and integral images, SURF does not have to iteratively apply the same filter to the output of a previously filtered layer, but instead can apply such filters of any size at exactly the same speed directly on the original image.
The 9×9 box filters shown above are approximations of second-order Gaussian derivatives with σ = 1.2 and represent the lowest scale. The determinant of the approximated Hessian is computed as det(H_approx) = Dxx·Dyy − (w·Dxy)², where the relative weight w = 0.9 (Bay's suggestion) balances the box-filter responses.
Scale-space representation
Scale spaces are usually implemented as image pyramids: the images are repeatedly smoothed with a Gaussian and subsequently sub-sampled in order to reach the next higher level of the pyramid. Due to the use of box filters and integral images, SURF does not have to iteratively apply the same filter to the output of a previously filtered layer, but instead can apply such filters of any size at exactly the same speed directly on the original image, and even in parallel. Therefore, the scale space is analyzed by up-scaling the filter size (9×9 → 15×15 → 21×21 → 27×27, etc.) rather than by iteratively reducing the image size. For each new octave, the filter size increase is doubled, and simultaneously the sampling intervals for the extraction of the interest points can be doubled as well, which allows the up-scaling of the filter at constant cost. In order to localize interest points in the image and over scales, a non-maximum suppression in a 3×3×3 neighborhood is applied. Instead of iteratively reducing the image size (left), the use of integral images allows the up-scaling of the filter at constant cost (right).
Feature Description
The creation of the SURF descriptor takes place in two steps. The first step consists of fixing a reproducible orientation based on information from a circular region around the interest point. The second step then constructs a square region aligned to that orientation, from which the descriptor is extracted.
Orientation Assignment
In order to be invariant to rotation, SURF tries to identify a reproducible orientation for the interest points. To achieve this:
1. SURF first calculates the Haar-wavelet responses in the x and y directions, in a circular neighborhood of radius 6s around the key point, where s is the scale at which the key point was detected. The sampling step is scale dependent and chosen to be s, and the wavelet responses are computed at the current scale s. Accordingly, at high scales the size of the wavelets is big, so integral images are used again for fast filtering.
For reasons of simplicity, we call dx the Haar wavelet response in the horizontal direction and dy the Haar wavelet response in the vertical direction (filter size 2s). To increase the robustness towards geometric deformations and localization errors, the responses dx and dy are first weighted with a Gaussian (σ = 3.3s) centered at the key point.
2. Then we calculate the sum of the vertical and horizontal wavelet responses in a scanning area, change the scanning orientation (add π/3), and re-calculate, until we find the orientation with the largest sum value; this orientation is the main orientation of the feature descriptor.
1. The first step consists of constructing a square region centered around the key point and oriented along the orientation obtained above. The size of this window is 20s.
2. Then the region is split up regularly into smaller 4 × 4 square sub-regions. For each sub-region, we compute a few simple features at 5 × 5 regularly spaced sample points.
Then, the wavelet responses dx and dy are summed up over each sub-region and form a first set of entries to the feature vector. In order to bring in information about the polarity of the intensity changes, we also extract the sum of the absolute values of the responses, |dx| and |dy|. Hence, each sub-region has a four-dimensional descriptor vector V for its underlying intensity structure, V = (∑dx, ∑dy, ∑|dx|, ∑|dy|). This results in a descriptor vector of length 64 over all 4×4 sub-regions. (In SIFT, the descriptor is a 128-D vector, which is part of the reason why SURF is faster than SIFT.)
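As a toy MATLAB illustration of these four entries, assuming dx and dy hold the 5×5 Haar wavelet responses of a single sub-region (random stand-ins here):

    % Four descriptor entries of one SURF sub-region.
    dx = randn(5); dy = randn(5);        % stand-ins for real wavelet responses
    V = [sum(dx(:)), sum(dy(:)), sum(abs(dx(:))), sum(abs(dy(:)))];
    % Concatenating V over the 4x4 sub-regions yields the 64-D descriptor.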
The Fisher Vector encoding is obtained by fitting a generative parametric model, e.g. a Gaussian Mixture Model (GMM), to the features to be encoded. Let X = {xt, t = 1, …, T} be D-dimensional local descriptors extracted from a face image I, and let {μk, Σk, wk, k = 1, …, M} be the means, the covariance matrices and the weights of the GMM trained on a large set of local descriptors. The derivatives of the model with respect to the mean and the covariance parameters capture the first- and second-order differences between the features X and each of the GMM components:

Φ(1)_k = (1 / (T√wk)) ∑_t γt(k) (xt − μk) / σk,  (3)
Φ(2)_k = (1 / (T√(2wk))) ∑_t γt(k) [ ((xt − μk)² / σk²) − 1 ],  (4)

where the divisions are element-wise, σk denotes the vector of standard deviations of component k (diagonal covariance assumption), and γt(k) is the soft assignment weight of the feature xt to the GMM component k:

γt(k) = wk uk(xt) / ∑_i wi ui(xt).  (5)

Here, ui denotes the probability density function of the Gaussian component i. The concatenation of these two order differences, [Φ(1)_1, …, Φ(1)_M, Φ(2)_1, …, Φ(2)_M], represents the Fisher Vector of the image I described by its local descriptors X. The dimensionality of this vector is 2MD. A Fisher Vector represents how the distribution of the local descriptors X differs from the distribution of the GMM trained with all the training images. To further improve the performance, the Fisher Vectors are normalized using square rooting followed by L2 normalization [7]. The figure depicts the general block diagram of our face spoofing detection method.
For each mode k, consider the mean and covariance deviation vectors uk and vk, where j = 1, 2, …, D spans the vector dimensions. The FV of image I is the stacking of the vectors uk and then of the vectors vk for each of the K modes in the Gaussian mixture: Φ(I) = [u1, …, uK, v1, …, vK].
1. Non-linear additive kernel. The Hellinger kernel (or Bhattacharyya coefficient) can be used instead of the linear one at no cost by signed square rooting. This is obtained by applying the function sign(z)·√|z| to each dimension of the vector Φ(I). Other additive kernels can also be used at an increased space or time cost.
2. Normalization. Before using the representation in a linear model (e.g. a support vector machine), the vector Φ(I) is further normalized by the ℓ2 norm (note that the standard Fisher vector is normalized by the number of encoded features).
After square-rooting and normalization, the IFV is often used in a linear classifier such as
an SVM.
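Putting Eqs. (3)-(5) and the two normalizations together, here is a minimal MATLAB sketch of Fisher Vector encoding using fitgmdist and posterior from the Statistics and Machine Learning Toolbox; the sizes and the random stand-in descriptors are illustrative assumptions:

    % Fisher Vector of one image's local descriptors under a diagonal GMM.
    T = 500; D = 64; M = 8;                         % illustrative sizes
    Xtrain = randn(5000, D);                        % pooled training descriptors
    gmm = fitgmdist(Xtrain, M, 'CovarianceType', 'diagonal', ...
                    'RegularizationValue', 1e-4);
    X = randn(T, D);                                % descriptors of one image
    gam = posterior(gmm, X);                        % soft assignments, Eq. (5)
    fv = zeros(1, 2*M*D);
    for k = 1:M
        mu = gmm.mu(k, :);
        sg = sqrt(squeeze(gmm.Sigma(1, :, k)));     % per-dimension std. dev.
        wk = gmm.ComponentProportion(k);
        dev = (X - mu) ./ sg;                       % normalized deviations
        u = sum(gam(:, k) .* dev, 1) / (T * sqrt(wk));            % Eq. (3)
        v = sum(gam(:, k) .* (dev.^2 - 1), 1) / (T * sqrt(2*wk)); % Eq. (4)
        fv((k-1)*2*D + (1:D))     = u;
        fv((k-1)*2*D + D + (1:D)) = v;
    end
    fv = sign(fv) .* sqrt(abs(fv));                 % signed square rooting
    fv = fv / norm(fv);                             % L2 normalization

The resulting 2MD-dimensional vector can then be fed to a linear SVM (e.g. via fitcsvm) as described above.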
Suppose there is a set of data points that needs to be grouped into several parts or clusters based on dissimilarity. In machine learning, this is known as clustering. Common approaches include (see the sketch after this list):
• Hierarchical Clustering
• Gaussian Mixture Models
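A small MATLAB illustration of both approaches on synthetic 2-D data (both functions are in the Statistics and Machine Learning Toolbox):

    % Hierarchical clustering and GMM clustering on two synthetic groups.
    X = [randn(50, 2); randn(50, 2) + 4];     % two well-separated groups
    Z = linkage(X, 'ward');                   % hierarchical: build dendrogram
    hcLabels = cluster(Z, 'maxclust', 2);     % cut into 2 clusters
    gm = fitgmdist(X, 2);                     % fit a 2-component GMM
    gmLabels = cluster(gm, X);                % assign points to components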
Principal Component Analysis (PCA) replaces the original variables with a smaller number of principal components, thus reducing the number of variables significantly with minimal loss of information.
This section focuses more on a practical, step-by-step PCA implementation on image data than on a theoretical explanation, as there are already plenty of materials available for the latter. Image data has been chosen over tabular data so that the reader can better understand the working of PCA through image visualization. Technically, an image is a matrix of pixels whose brightness represents the reflectance of surface features within that pixel.
The standard context for PCA as an exploratory data analysis tool involves a dataset with observations on p numerical variables for each of n entities or individuals. These data values define p n-dimensional vectors x1, …, xp or, equivalently, an n×p data matrix X, whose jth column is the vector xj of observations on the jth variable. We seek a linear combination of the columns of matrix X with maximum variance. Such linear combinations are given by Xa, where a is a vector of constants a1, a2, …, ap. The variance of any such linear combination is given by var(Xa) = a′Sa, where S is the sample covariance matrix associated with the dataset and ′ denotes transpose. Hence, identifying the linear combination with maximum variance is equivalent to obtaining a p-dimensional vector a which maximizes the quadratic form a′Sa. For this problem to have a well-defined solution, an additional restriction must be imposed, and the most common restriction involves working with unit-norm vectors, i.e. requiring a′a = 1. The problem is equivalent to maximizing a′Sa − λ(a′a − 1), where λ is a Lagrange multiplier. Differentiating with respect to the vector a and equating to the null vector produces the equation

Sa = λa.  (2.1)
Thus, a must be a (unit-norm) eigenvector, and λ the corresponding eigenvalue, of the covariance matrix S. In particular, we are interested in the largest eigenvalue, λ1 (and the corresponding eigenvector a1), since the eigenvalues are the variances of the linear combinations defined by the corresponding eigenvectors a: var(Xa) = a′Sa = λa′a = λ. Equation (2.1) remains valid if the eigenvectors are multiplied by −1, and so the signs of all loadings (and scores) are arbitrary; only their relative magnitudes and sign patterns are meaningful.
Any p×p real symmetric matrix, such as a covariance matrix S, has exactly p real eigenvalues, λk (k = 1, …, p), and their corresponding eigenvectors can be defined to form an orthonormal set of vectors, i.e. a′k ak′ = 1 if k = k′ and zero otherwise. A Lagrange multiplier approach, with the added restrictions of orthogonality of the different coefficient vectors, can also be used to show that the full set of eigenvectors of S are the solutions to S ak = λk ak. It is these linear combinations Xak that are called the principal components of the dataset, although some authors confusingly also use the term 'principal components' when referring to the eigenvectors ak. In standard PCA terminology, the elements of the eigenvectors ak are commonly called the PC loadings, whereas the elements of the linear combinations Xak are called the PC scores, as they are the values that each individual would score on a given PC.
It is common, in the standard approach, to define PCs as the linear combinations of the centred variables x*j. This changes nothing essential in the solution (other than centring), since the covariance matrix of a set of centred or uncentred variables is the same, but it has the advantage of providing a direct connection to an alternative, more geometric approach to PCA. Denoting by X* the n×p matrix whose columns are the centred variables x*j, we have

S = (1/(n−1)) X*′X*.  (2.2)

This equation links up the eigendecomposition of the covariance matrix S with the singular value decomposition (SVD) of the column-centred data matrix X*. Any arbitrary matrix Y of rank r can be written as

Y = U L A′,  (2.3)
where U and A are n×r and p×r matrices with orthonormal columns (U′U = Ir = A′A, with Ir the r×r identity matrix) and L is an r×r diagonal matrix. The columns of A are called the right singular vectors of Y and are the eigenvectors of the p×p matrix Y′Y associated with its non-zero eigenvalues. The columns of U are called the left singular vectors of Y and are the eigenvectors of the n×n matrix YY′ that correspond to its non-zero eigenvalues. The diagonal elements of matrix L are called the singular values of Y and are the non-negative square roots of the (common) non-zero eigenvalues of both Y′Y and YY′. We assume that the diagonal elements of L are in decreasing order, which uniquely defines the order of the columns of U and A (except for the case of equal singular values [4]). Hence, taking Y = X*, the right singular vectors of the column-centred data matrix X* are the vectors ak of PC loadings. Due to the orthogonality of the columns of A, the columns of the matrix product X*A = ULA′A = UL are the PCs of X*. The variances of these PCs are given by the squares of the singular values of X*, divided by n−1. Equivalently, given (2.2), (2.3) and the above properties,

(n−1) S = X*′X* = A L² A′,  (2.4)

where L² is the diagonal matrix with the squared singular values (i.e. the eigenvalues of (n−1)S). Equation (2.4) gives the spectral decomposition, or eigendecomposition, of the matrix (n−1)S. Hence, PCA is equivalent to an SVD of the column-centred data matrix X*.
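The following MATLAB sketch checks this equivalence numerically on random stand-in data, comparing the SVD route with the built-in pca function:

    % PCA via SVD of the column-centred data matrix.
    X  = randn(100, 5);                 % n = 100 observations, p = 5 variables
    Xc = X - mean(X, 1);                % column-centred matrix X*
    [U, L, A] = svd(Xc, 'econ');        % Xc = U*L*A'
    scores    = U * L;                  % PC scores (equal to Xc*A)
    variances = diag(L).^2 / (size(X, 1) - 1);   % PC variances (eigenvalues of S)
    [coeff, score2, latent] = pca(X);   % built-in PCA for comparison
    % coeff matches A and latent matches variances, up to column signs.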
The properties of an SVD imply interesting geometric interpretations of a PCA. Given any rank-r matrix Y of size n×p, the matrix Yq of the same size, but of rank q < r, whose elements minimize the sum of squared differences with the corresponding elements of Y, is given [7] by

Yq = Uq Lq A′q,  (2.5)

where Lq is the q×q diagonal matrix with the first (largest) q diagonal elements of L, and Uq, Aq are the n×q and p×q matrices obtained by retaining the q corresponding columns of U and A.
In our context, the n rows of a rank-r column-centred data matrix X* define a scatter plot of n points in an r-dimensional subspace, with the origin as the centre of gravity of the scatter plot. The above result implies that the best n-point approximation to this scatter plot in a q-dimensional subspace is given by the rows of X*q, defined as in equation (2.5), where 'best' means that the sum of squared distances between the original points and their approximations is minimized.
The total variance is the sum of the variances of all p PCs. Hence, the standard measure of quality of a given PC is the proportion of total variance that it accounts for,

πk = λk / tr(S),

where tr(S) denotes the trace of S. The incremental nature of PCs also means that we can speak of the proportion of total variance explained by a set of PCs (usually, but not necessarily, the first q PCs), which is often expressed as a percentage of the total variance accounted for:

(∑_{k=1}^{q} λk / tr(S)) × 100%.
variance accounted for is a fundamental tool to assess the quality of these low-
dimensional graphical representations of the dataset. The emphasis in PCA is almost
always on the first few PCs, but there are circumstances in which the last few may be of
interest, such as in outlier detection [4] or some applications of image analysis.
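As a small MATLAB illustration of the percentage of variance explained, continuing the hypothetical data of the earlier sketches:

% Sketch: proportion of total variance accounted for by the first q PCs.
X    = rand(100, 5);  Xc = X - mean(X);
vars = svd(Xc).^2 / (size(X,1) - 1);         % PC variances (eigenvalues of S)
q    = 2;
pctVar = 100 * sum(vars(1:q)) / sum(vars);   % percentage of total variance
fprintf('First %d PCs explain %.1f%% of the variance\n', q, pctVar);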
PCs can also be introduced as the optimal solutions to numerous other problems.
Optimality criteria for PCA are discussed in detail in numerous sources, among others.
McCabe uses some of these criteria to select optimal subsets of the original variables,
which he calls principal variables; this is a different, computationally more complex,
problem.
CHAPTER-5
SOFTWARE IMPLEMENTATION
5.1 IMAGE:
Each pixel has a color. The color is a 32-bit integer. The first eight bits determine
the redness of the pixel, the next eight bits the greenness, the next eight bits the blueness,
and the remaining eight bits the transparency of the pixel.
Fig 5.3 RGB Representation
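The four 8-bit channels of such a packed 32-bit value can be unpacked with bit operations. The sketch below assumes the red-to-transparency ordering described above; the packed value is an arbitrary example:

% Hedged sketch: unpacking a hypothetical 32-bit packed pixel value in MATLAB.
c = uint32(hex2dec('80FF4020'));            % example packed pixel
r = bitand(bitshift(c, -24), uint32(255));  % first eight bits: redness
g = bitand(bitshift(c, -16), uint32(255));  % next eight bits: greenness
b = bitand(bitshift(c,  -8), uint32(255));  % next eight bits: blueness
a = bitand(c, uint32(255));                 % remaining eight bits: transparency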
Image file size is expressed as the number of bytes, which increases with the number
of pixels composing an image and the color depth of the pixels. The greater the number
of rows and columns, the greater the image resolution, and the larger the file. Also, each
pixel of an image increases in size when its color depth increases: an 8-bit pixel (1 byte)
stores 256 colors, while a 24-bit pixel (3 bytes) stores 16 million colors, the latter known
as true color.
Image compression uses algorithms to decrease the size of a file. High-resolution
cameras produce large image files, ranging from hundreds of kilobytes to megabytes,
depending on the camera's resolution and the image-storage format. High-resolution
digital cameras record 12-megapixel (1 MP = 1,000,000 pixels) images, or more, in
true color. Consider, for example, an image recorded by a 12 MP camera: since each
pixel uses 3 bytes to record true color, the uncompressed image would occupy
36,000,000 bytes of memory, a great amount of digital storage for a single image, given
that cameras must record and store many images to be practical. Faced with large file
sizes, both within the camera and on the storage disc, image file formats were developed
to store such large images.
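The arithmetic above is easy to verify in MATLAB; the 4000×3000 sensor layout assumed below is one common way to arrive at 12 MP:

% Sketch: uncompressed size of a 12 MP true-color image (assumed 4000x3000).
pixels      = 4000 * 3000;              % 12 megapixels
bytesPerPix = 3;                        % 24-bit true color = 3 bytes per pixel
sizeBytes   = pixels * bytesPerPix      % = 36,000,000 bytes, as stated above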
Image file formats are standardized means of organizing and storing images. This
entry is about digital image formats used to store photographic and other images. Image
files are composed of either pixel or vector (geometric) data that are rasterized to pixels
when displayed (with few exceptions) on a vector graphics display. Including proprietary
types, there are hundreds of image file types. The PNG, JPEG, and GIF formats are most
often used to display images on the Internet.
EXIF:
The EXIF (Exchangeable Image File Format) standard is a file format similar to the
JFIF format with TIFF extensions. It is incorporated in the JPEG-writing software used
in most cameras. Its purpose is to record and to standardize the exchange of images with
image metadata between digital cameras and editing and viewing software. The
metadata are recorded for individual images and include such things as camera settings,
time and date, shutter speed, exposure, image size, compression, name of camera, color
information, etc. When images are viewed or edited by image editing software, all of
this image information can be displayed.
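In MATLAB, such metadata can be inspected with the built-in imfinfo function; in the sketch below the filename is hypothetical, and the DigitalCamera field only exists when the file actually carries EXIF data:

% Sketch: reading image metadata (filename is an assumption).
info = imfinfo('photo.jpg');           % struct of file metadata
disp([info.Width, info.Height]);       % image size in pixels
if isfield(info, 'DigitalCamera')      % EXIF camera settings, when present
    disp(info.DigitalCamera);
end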
TIFF: The TIFF (Tagged Image File Format) is a flexible format that normally saves
8 bits or 16 bits per color (red, green, blue), for 24-bit and 48-bit totals respectively,
usually using either the TIFF or TIF filename extension. TIFFs can be lossy or lossless;
some offer relatively good lossless compression for bi-level (black & white) images.
Some digital cameras can save in TIFF format, using the LZW compression algorithm
for lossless storage. The TIFF image format is not widely supported by web browsers.
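MATLAB's imwrite supports this lossless LZW option for TIFF output; a short sketch using one of MATLAB's bundled demo images:

% Sketch: saving an image as an LZW-compressed (lossless) TIFF.
A = imread('peppers.png');                     % demo image shipped with MATLAB
imwrite(A, 'out.tif', 'Compression', 'lzw');   % lossless LZW storage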
PNG: The PNG (Portable Network Graphics) file format was created as the free, open-
source successor to the GIF. The PNG file format supports true color (16 million colors),
while the GIF supports only 256 colors. The PNG format excels when the image has
large, uniformly colored areas. The lossless PNG format is best suited for editing
pictures, while lossy formats, like JPG, are best for the final distribution of photographic
images, because JPG files are smaller than PNG files. PNG is an extensible file format
for the lossless, portable, well-compressed storage of raster images; it provides a patent-
free replacement for GIF and can also replace many common uses of TIFF. Indexed-
color, grayscale, and true-color images are supported, plus an optional alpha channel.
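The size difference between lossless PNG and lossy JPG is easy to demonstrate; a minimal sketch, in which the 75% JPEG quality setting is an arbitrary choice:

% Sketch comparing lossless PNG against lossy JPEG for the same image.
A = imread('peppers.png');
imwrite(A, 'img.png');                  % lossless PNG
imwrite(A, 'img.jpg', 'Quality', 75);   % lossy JPEG
p = dir('img.png'); j = dir('img.jpg');
fprintf('PNG: %d bytes, JPG: %d bytes\n', p.bytes, j.bytes);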
GIF: GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors.
This makes the GIF format suitable for storing graphics with relatively few colors,
such as simple diagrams, shapes, logos and cartoon-style images. The GIF format
supports animation and is still widely used to provide image animation effects. It
also uses a lossless compression that is more effective when large areas have a single
color, and ineffective for detailed or dithered images.
BMP: The BMP file format (Windows bitmap) handles graphics files within the
Microsoft Windows OS. Typically, BMP files are uncompressed, and hence large; the
advantage is their simplicity and wide acceptance in Windows programs.
MATLAB is an interactive system whose basic data element is an array that does
not require dimensioning. This allows you to solve many technical computing problems,
especially those with matrix and vector formulations, in a fraction of the time it would
take to write a program in a scalar non-interactive language such as C or Fortran.
The name MATLAB stands for matrix laboratory. MATLAB was originally written to
provide easy access to matrix software developed by the LINPACK and EISPACK
projects. Today, the MATLAB engine incorporates the LAPACK and BLAS libraries,
embedding the state of the art in software for matrix computation.
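A small sketch of this array-based style, with illustrative values only:

% Sketch: matrix computations without explicit dimensioning or loops.
A = rand(3);              % 3x3 random matrix
b = ones(3, 1);
x = A \ b;                % solve A*x = b via the LAPACK-backed backslash
e = eig(A);               % eigenvalues, also computed through LAPACK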
Toolboxes allow you to learn and apply specialized technology. Toolboxes are
comprehensive collections of MATLAB functions (M-files) that extend the MATLAB
environment to solve particular classes of problems.
Development Environment:
This is the set of tools and facilities that help you use MATLAB functions and
Files. Many of these tools are graphical user interfaces. It includes the MATLAB
desktop and command window, a command history, an editor and debugger, and
browsers for viewing help, the workspace, files, and the search path.
Mathematical Function Library:
This is a vast collection of computational algorithms, ranging from elementary
functions like sum, sine, cosine, and complex arithmetic to more sophisticated functions
like matrix inverse, matrix eigenvalues, Bessel functions, and fast Fourier transforms.
Graphics:
MATLAB has extensive facilities for displaying vectors and matrices as graphs,
as well as annotating and printing these graphs. It includes high-level functions for two-
dimensional and three-dimensional data visualization, image processing, animation,
and presentation graphics. It also includes low-level functions that allow you to fully
customize the appearance of graphics, as well as to build complete graphical user
interfaces on your MATLAB applications.
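For example, a minimal sketch of the high-level plotting functions mentioned above:

% Sketch: 2-D and 3-D visualization with high-level graphics functions.
t = 0:0.01:2*pi;
plot(t, sin(t)); title('sin(t)');   % annotated 2-D line plot
figure;
surf(peaks(30));                    % 3-D surface of the built-in peaks function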
A file with extension .fig, called a FIG-file, which contains a complete graphical
description of all the function's GUI objects or elements and their spatial
arrangement. A FIG-file contains binary data that does not need to be parsed when
the associated GUI-based M-function is executed.
A file with extension .m, called a GUI M-file, which contains the code that controls
the GUI operation. This file includes functions that are called when the GUI is
launched and exited, and callback functions that are executed when a user interacts
with GUI objects, for example, when a button is pushed.
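A hypothetical callback of this kind might look as follows; the button and axes tags here are assumptions for illustration, not taken from the project's actual GUI:

% Sketch of a GUIDE-style callback executed when a push button is pressed.
function pushbutton1_Callback(hObject, eventdata, handles)
% hObject   handle to the button; handles   struct with the GUI's objects
img = imread('peppers.png');             % load an image (demo file)
imshow(img, 'Parent', handles.axes1);    % display it in an axes named axes1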
Fig 5.4 GUI
GUI components can include menus, toolbars, push buttons, radio buttons, list
boxes, and sliders, to name a few. GUIs created using MATLAB tools can also
perform any type of computation, and read and write data files.
RESULT
Methods based solely on single images of the face region exploit the fact that fake face images
captured from printed photos, video displays, and masks usually suffer from several issues related to
the spoofing medium.
FLOW CHART:
ALGORITHM:
ADVANTAGES:
A face anti-spoofing detection method based on depth information has obvious advantages:
depth information is invariant to illumination, so the robustness of the face anti-spoofing
detection is good.
Prevents static and dynamic 2D spoofs.
Supports active and passive liveness checks.
APPLICATIONS:
Digital banking,
Identity validation at ATMs,
Forensic investigations,
Online assessments,
Retail crime prevention,
School surveillance,
Law enforcement,
Casino security.
CONCLUSION
CONCLUSION:
We proposed a face anti-spoofing scheme based on color SURF (CSURF) features and Fisher Vector
encoding. We extracted the SURF features from two different color spaces (HSV and YCbCr). Then, we
applied PCA and Fisher Vector encoding on the concatenated features. The proposed approach, based
on fusing the features extracted from the HSV and YCbCr color spaces, was able to perform very well
on three of the most challenging face spoofing datasets, outperforming state-of-the-art results.
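A hedged sketch of this pipeline in MATLAB is given below. It requires the Computer Vision Toolbox; the filename is hypothetical, and the PCA and Fisher Vector stages are indicated only as placeholders, since their exact parameters are not restated here:

% Sketch: SURF features from HSV and YCbCr channels, then concatenation.
I   = imread('face.jpg');                  % hypothetical input face image
hsv = im2uint8(rgb2hsv(I));                % HSV color space
ycc = rgb2ycbcr(I);                        % YCbCr color space
feat = [];
for C = {hsv, ycc}
    for ch = 1:3                           % SURF on each color channel
        pts  = detectSURFFeatures(C{1}(:, :, ch));
        d    = extractFeatures(C{1}(:, :, ch), pts);
        feat = [feat; d];                  % concatenated descriptors
    end
end
% PCA (e.g. coeff = pca(double(feat))) and Fisher Vector encoding follow.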
FUTURE SCOPE
FUTURE SCOPE:
The attendance management system can be designed and improved by adding features that
indicate whether the employee or student is late. A further enhancement is to extend the
current flash memory to store the complete details of the student. The system can also be
enhanced to track the arrival and exit times of the student or employee for additional
monitoring.
SOURCE CODE
if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
% --- Outputs from this function are returned to the command line.
function varargout = DeskGUI_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% a = imread('icon\a.jpg');
% b = imresize(a, 0.4);
% set(handles.input, 'CData', b);      % icon for the input button
% a = imread('icon\e.jpg');
% b = imresize(a, 0.4);
% set(handles.exit, 'CData', b);       % icon for the exit button
% a = imread('icon\h.jpg');
% b = imresize(a, 0.4);
% set(handles.help, 'CData', b);       % icon for the help button
% a = imread('icon\p.jpg');
% b = imresize(a, 0.2);
% set(handles.process, 'CData', b);    % icon for the process button
%
% Get default command line output from handles structure
varargout{1} = handles.output;
REFERENCES
REFERENCES:
[1] Y. Li, K. Xu, Q. Yan, Y. Li, and R. H. Deng, "Understanding OSN-based facial
disclosure against face authentication systems," in Proceedings of the 9th ACM
Symposium on Information, Computer and Communications Security, ser. ASIA CCS '14.
ACM, 2014, pp. 413–424.
[2] A. Anjos, J. Komulainen, S. Marcel, A. Hadid, and M. Pietikäinen, "Face anti-
spoofing: visual approach," in Handbook of Biometric Anti-Spoofing, S. Marcel,
M. S. Nixon, and S. Z. Li, Eds. Springer, 2014, ch. 4, pp. 65–82.
[3] J. Galbally, S. Marcel, and J. Fiérrez, "Biometric antispoofing methods: A survey in
face recognition," IEEE Access, vol. 2, pp. 1530–1552, 2014.
[4] A. Anjos and S. Marcel, "Counter-measures to photo attacks in face recognition: a
public database and a baseline," in Proceedings of the IAPR IEEE International Joint
Conference on Biometrics (IJCB), 2011.
[5] T. de Freitas Pereira, J. Komulainen, A. Anjos, J. M. De Martino, A. Hadid,
M. Pietikäinen, and S. Marcel, "Face liveness detection using dynamic texture,"
EURASIP Journal on Image and Video Processing, 2013.
[6] S. Bharadwaj, T. I. Dhamecha, M. Vatsa, and R. Singh, "Computationally efficient
face spoofing detection with motion magnification," in Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Workshop on Biometrics, 2013.
[7] S. Tirunagari, N. Poh, D. Windridge, A. Iorliam, N. Suki, and A. T. S. Ho, "Detection
of face spoofing using visual dynamics," IEEE Transactions on Information Forensics
and Security, vol. 10, no. 4, pp. 762–777, 2015.
[13] T. de Freitas Pereira, A. Anjos, J. De Martino, and S. Marcel, "Can face anti-
spoofing countermeasures work in a real world scenario?" in International Conference
on Biometrics (ICB), June 2013, pp. 1–8.
[14] J. Yang, Z. Lei, and S. Z. Li, "Learn convolutional neural network for face anti-
spoofing," CoRR, vol. abs/1408.5601, 2014. [Online]. Available:
http://arxiv.org/abs/1408.5601
[15] Z. Boulkenafet, J. Komulainen, and A. Hadid, "Face anti-spoofing based on color
texture analysis," in IEEE International Conference on Image Processing (ICIP 2015), 2015.
[23] K. Kollreider, H. Fronthaler, and M. I. Faraj, "Real-time face detection and motion
analysis with application in 'liveness' assessment," IEEE Transactions on Information
Forensics and Security, vol. 2, no. 3-2, pp. 548–558, 2007.
[24] K. Kollreider, H. Fronthaler, and J. Bigun, "Non-intrusive liveness detection by
face images," Image and Vision Computing, vol. 27, pp. 233–244, 2009.
[25] A. Anjos and S. Marcel, "Counter-measures to photo attacks in face recognition: a
public database and a baseline," in Proceedings of the IAPR IEEE International Joint
Conference on Biometrics (IJCB), 2011.
[26] W. Bao, H. Li, and N. Li, "A liveness detection method for face recognition based
on optical flow field," in Proc. 2009 International Conference on Image Analysis and
Signal Processing, pp. 233–236, 2009.
[27] A. Lagorio, M. Tistarelli, and M. Cadoni, "Liveness detection based on 3D face
shape analysis," in 2013 International Workshop on Biometrics and Forensics (IWBF),
pp. 1–4, 2013.
[28] T. Wang, J. Yang, and Z. Lei, "Face liveness detection using 3D structure recovered
from a single camera," in International Conference on Biometrics, 2013.
[29] E. S. Ng and A. Y. S. Chia, "Face verification using temporal affective cues," in Proc.