
Received January 25, 2022, accepted February 16, 2022, date of publication February 24, 2022, date of current version March 10, 2022.


Digital Object Identifier 10.1109/ACCESS.2022.3154404

Deepfake Detection: A Systematic Literature Review
MD SHOHEL RANA1,2, (Member, IEEE), MOHAMMAD NUR NOBI3, (Member, IEEE), BEDDHU MURALI2, AND ANDREW H. SUNG2, (Member, IEEE)
1 Department of Computer Science, Northern Kentucky University, Highland Heights, KY 41099, USA
2 School of Computing Sciences and Computer Engineering, The University of Southern Mississippi, Hattiesburg, MS 39401, USA
3 Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX 78249, USA

Corresponding author: Md Shohel Rana ([email protected])


This work was supported in part by Northern Kentucky University and the University of Southern Mississippi.

ABSTRACT Over the last few decades, rapid progress in AI, machine learning, and deep learning has resulted in new techniques and various tools for manipulating multimedia. Though the technology has mostly been used in legitimate applications such as entertainment and education, malicious users have also exploited it for unlawful or nefarious purposes. For example, high-quality and realistic fake videos, images, or audio have been created to spread misinformation and propaganda, foment political discord and hate, or even harass and blackmail people. Such manipulated, high-quality, and realistic videos have recently become known as Deepfakes. Various approaches have since been described in the literature to deal with the problems raised by Deepfake. To provide an updated overview of the research on Deepfake detection, we conduct a systematic literature review (SLR) in this paper, summarizing 112 relevant articles from 2018 to 2020 that presented a variety of methodologies. We analyze them by grouping them into four different categories: deep learning-based techniques, classical machine learning-based methods, statistical techniques, and blockchain-based techniques. We also evaluate the detection performance of the various methods with respect to different datasets and conclude that the deep learning-based methods outperform other methods in Deepfake detection.

INDEX TERMS Deepfake detection, video or image manipulation, digital media forensics, systematic
literature review.

I. INTRODUCTION
The notable advances in artificial neural network (ANN) based technologies play an essential role in tampering with multimedia content. For example, AI-enabled software tools like FaceApp [1] and FakeApp [2] have been used for realistic-looking face swapping in images and videos. This swapping mechanism allows anyone to alter the facial appearance, hairstyle, gender, age, and other personal attributes. The propagation of these fake videos causes much anxiety, and they have become widely known as Deepfakes.
The term ''Deepfake'' is derived from ''Deep Learning (DL)'' and ''Fake,'' and it describes specific photo-realistic video or image content created with DL's support. The word was named after an anonymous Reddit user who, in late 2017, applied deep learning methods to replace a person's face in videos with another person's face and created photo-realistic fake videos. To generate such counterfeit videos, two neural networks, (i) a generative network and (ii) a discriminative network, were used together with a FaceSwap technique [3], [4]. The generative network creates fake images using an encoder and a decoder, while the discriminative network judges the authenticity of the newly generated images. The combination of these two networks is called a Generative Adversarial Network (GAN), proposed by Ian Goodfellow [5].
Based on a yearly report [6] on Deepfake, DL researchers have made several related breakthroughs in generative modeling. For example, computer vision researchers proposed a method known as Face2Face [7] for facial re-enactment; it transfers facial expressions from one person to a real digital 'avatar' in real time. In 2017, researchers from UC Berkeley presented CycleGAN [8] to transform images and videos into different styles.
The associate editor coordinating the review of this manuscript and approving it for publication was Zahid Akhtar.


Another group of scholars, from the University of Washington, proposed a method to synchronize the lip movement in a video with speech from another source [9]. Finally, in November 2017, the term ''Deepfake'' emerged for sharing videos in which celebrities' faces were swapped with the original ones. In January 2018, a Deepfake creation service was launched by various websites backed by some private sponsors. A month later, several websites, including Gfycat [10] and Twitter, banned these services. However, considering the threats and the potential risks of privacy vulnerabilities, the study of Deepfake emerged very fast. Rossler et al. introduced a vast video dataset for training media forensics and Deepfake detection tools, called FaceForensics [11], in March 2018. A month later, researchers at Stanford University published a method, ''Deep video portraits'' [12], that enables photo-realistic re-animation of portrait videos. UC Berkeley researchers developed another approach [13] for transferring a person's body movements to another person in a video. NVIDIA introduced a style-based generator architecture for GANs [14] for synthetic image generation.
According to the report [6], the Google search engine could find multiple web pages that contain Deepfake related videos (see Figure 1). We found the following additional information in this report [6]:
FIGURE 1. Left: Google search engine finds web pages containing ''Deepfake'' keyword (web pages count vs. month). Right: Google search engine finds web pages holding Deepfake related videos (web pages count vs. month).
• The top 10 platforms posted 1,790+ Deepfake videos, even though 'Deepfakes' searches had been removed.
• Different pages posted 6,174 Deepfake videos with fake video content.
• Three new platforms were devoted to distributing Deepfake videos.
• In 2018, 902 articles were published on arXiv including the keyword GAN in their titles or abstracts.
• 25 papers published on this subject, including non-peer-reviewed ones, were investigated, and DARPA funded 12 of them.
Apart from Deepfake videos, there are many other malicious or illegal uses of Deepfake, such as spreading misinformation, creating political instability, or various cybercrimes. To address such threats, the field of Deepfake detection has attracted considerable attention from academics and experts during the last few years, resulting in many Deepfake detection techniques. There are also some efforts on surveying selected literature focusing on either detection methods or performance analysis. However, a more comprehensive overview of this research area will be beneficial in serving the community of researchers and practitioners by providing summarized information about Deepfake in all aspects, including available datasets, which is noticeably missing in previous surveys. Toward that end, we present a systematic literature review (SLR) on Deepfake detection in this paper. We aim to describe and analyze common grounds and the diversity of approaches in current practices on Deepfake detection. Our contributions are summarized as follows.
• We perform a comprehensive survey on the existing literature in the Deepfake domain. We report current tools, techniques, and datasets for Deepfake detection-related research by posing some research questions.
• We introduce a taxonomy that classifies Deepfake detection techniques into four categories with an overview of the different categories and related features, which is novel and the first of its kind.
• We conduct an in-depth analysis of the primary studies' experimental evidence. Also, we evaluate the performance of various Deepfake detection methods using different measurement metrics.
• We highlight a few observations and deliver some guidelines on Deepfake detection that might help future research and practices in this spectrum.
The remainder of the paper is organized as follows: Section II presents the review procedure by defining the research questions of interest. In Section III, we thoroughly discuss the findings from the different studies. Section IV summarizes the overall observations of the study, and we present the challenges and limitations in Section V. Finally, Section VI concludes the paper.

II. PROCESS OF SLR
There are two landmark literature surveys, proposed by Budgen et al. [15] and Zlatko Stapić et al. [16], in the field of software engineering. We adopt their approaches in our SLR and categorize the review process into three main stages, as shown in Figure 2, in order to identify, evaluate, and understand the various research related to particular research questions.
Planning the Review. The purposes of this stage are to (a) identify the need, (b) develop criteria and procedures, and (c) evaluate the criteria and procedures related to this SLR.
Conducting the Review. Based on the guiding principles proposed in [17]–[19], this stage includes six obligatory phases.
A. Research Questions (RQs): The purpose of the RQ phase is to identify relevant studies that need to be considered in the current review. We determine a set of RQs (described later) in the context of the Deepfake domain.
B. Search Strategy (SS): A predefined search strategy aims to find as many primary studies related to our research questions as possible. We try to establish an unbiased search strategy to detect as much of the relevant literature as possible.

FIGURE 2. The process of the SLR.
C. Study Selection Criteria (SSC): There are challenges in the literature selection process, including the language of the study, the knowledge of the authors, the institutions, the journals, or the year of publication, etc. [17]. Before ascertaining the selection criteria, we take careful consideration to ensure fairness in selecting primary studies that provide significant evidence about the research questions.
D. Quality Assessment Criteria (QAC): The goal of assessing each primarily selected study's quality is to ensure that the study findings are relevant and unbiased. We develop a set of quality criteria for evaluating the individual studies.
E. Data Extraction and Monitoring (DEM): We carefully determine how the information required from the selected studies is to be obtained and record the pieces of evidence.
F. Data Synthesis (DS): Data synthesis aims to organize and summarize the outcomes obtained from the selected studies. We follow a set of procedures to synthesize the information better.
Reporting the Review. After completing the review of all the studies, we report the outcomes in a suitable form to the distribution channel and target audience.

A. RESEARCH QUESTIONS (RQs)
Choosing research questions (RQs) is the first step in defining a particular study's overall purpose and expected outcomes. As such, we establish our RQs to make them meaningful to researchers, because the right question raises confidence in a domain [18]. Therefore, to capture recent practice in the field of Deepfake detection, we define four crucial questions (RQ 1-4) along with some supplementary questions (SRQs), shown in Table 1. As pointed out in the table, we first identify the different categories of Deepfake detection techniques. Next, we investigate the procedures of the related empirical experiments. Under the same research question, RQ-2, we drill down further by asking some supplemental questions (SRQ-2.1 to SRQ-2.4) that cover internal details, including:
• Describing the datasets used to conduct the experiments.
• The features that are commonly used by several methods.
• The models or algorithms used to detect Deepfake.
• The measurement metrics used to assess the various methods' performance in detecting such Deepfakes.
Then, we evaluate the overall performance of the different methods using various measurement metrics in RQ-3. Finally, we compare models with respect to efficiency using the same dataset and the same measurement metric.
TABLE 1. Define research questions for the SLR.

B. SEARCH STRATEGY (SS)
We intended to collect as many works as possible that are relevant to our research questions. While collecting Deepfake detection studies, we tried to include all combinations of related search phrases or keywords to avoid any bias. The key idea is to use Boolean operators to combine the search terms with 'AND' or 'OR'. The search words can be outlined primarily as (Deepfake OR FaceSwap OR Video manipulation OR Fake face/image/video) AND (detection OR detect) OR (Facial Manipulation OR Digital Media Forensics). Instead of relying on one or two sources, we looked into several repositories to ensure a proper search. There are many digital repositories available for finding research articles; we selected 10 popular repositories from them by considering their relevance and availability, as listed below:
• Web of Science
• IEEE Xplore Digital Library
• ACM Digital Library
• ScienceDirect (ELSEVIER)
• SpringerLink
• Google Scholar
• Semantic Scholar
• Cornell University
• Computing Research Repository
• Database Systems and Logic Programming (DBLP)
The repositories include journals, conferences, and archives. We limit our search duration from January 2018 to December 2020.
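As a purely illustrative aside, the Boolean combination above can be applied mechanically when screening candidate records. The Python sketch below filters a list of titles/abstracts with the stated keyword logic; the record structure, helper names, and example entries are our own illustrations, not part of the original search protocol.

```python
import re

# Keyword groups mirroring the Boolean search string used in this SLR:
# (Deepfake OR FaceSwap OR Video manipulation OR Fake face/image/video)
# AND (detection OR detect) OR (Facial Manipulation OR Digital Media Forensics)
SUBJECT = [r"deep\s*fake", r"face\s*swap", r"video manipulation",
           r"fake (face|image|video)"]
ACTION = [r"detect(ion)?"]
ALTERNATE = [r"facial manipulation", r"digital media forensics"]

def matches_any(patterns, text):
    """Return True if any regex pattern occurs in the text (case-insensitive)."""
    return any(re.search(p, text, flags=re.IGNORECASE) for p in patterns)

def is_candidate(record):
    """Apply the (SUBJECT AND ACTION) OR ALTERNATE screening rule."""
    text = record["title"] + " " + record.get("abstract", "")
    return (matches_any(SUBJECT, text) and matches_any(ACTION, text)) \
        or matches_any(ALTERNATE, text)

# Illustrative records, not actual results from the ten repositories.
records = [
    {"title": "A CNN approach to Deepfake video detection", "abstract": "..."},
    {"title": "Topic modeling for news articles", "abstract": "..."},
]
print([r["title"] for r in records if is_candidate(r)])
```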


C. STUDY SELECTION CRITERIA (SSC)
We establish three inclusion criteria in our search procedure in order to select the relevant articles while searching these 10 digital repositories:
• The search phrases are part of the title, abstract, or keywords.
• Some works mainly deal with Deepfake without mentioning the related keywords in the title, abstract, or keywords. In such a case, we look for the desired keywords in other parts of the literature and include those works if we find any.
• Empirical evidence is explicitly presented in writing.
Besides, a series of exclusion criteria are also established to skip studies that may not be relevant to this review (see Figure 3):
• Studies that are not written in English.
• A few pieces of research are published concurrently in conferences and journals. In such a case, we considered the most comprehensive one to avoid duplicates.
• As the primary objective of this SLR is to study image or video manipulation, we omitted audio and text manipulation analysis.
• We filter out research that focuses on specific transformation techniques in Deepfake detection.
FIGURE 3. Study selection process.

D. QUALITY ASSESSMENT CRITERIA (QAC)
Assessing the quality of the evidence contained within an SLR is as important as analyzing the data within it. Results from a poorly conducted study can be skewed by biases in the research methodology and should be interpreted with caution. Such studies should be acknowledged as such in the systematic review or excluded outright. Selecting appropriate criteria to help analyze the strength of evidence and the embedded biases within each paper is also essential. Based on the criteria defined in [20], we validate the selected studies and review them by applying these requirements. Also, a cross-checking approach has been used for assessing the selected studies to ensure consistency among the different findings. After this quality assessment phase, we finalized 91 research articles and 21 additional reviews (7 SLRs, 10 analyses, and 4 surveys) representing Deepfake detection.

E. DATA EXTRACTION AND MONITORING (DEM)
This phase describes designing the procedure for the actual extraction of data from the studies. To find possibly relevant articles, we thoroughly searched the 10 popular libraries listed above (see Figure 3). We chose studies that matched the following requirements: 1) the methods or results section stated which entities were or needed to be extracted, and 2) at least one entity was automatically extracted, with assessment findings given for this kind of entity. The answers to the RQs are determined based on the knowledge gained from the data extraction process (see Figure 4).


FIGURE 4. The information of the extracted data.
• Author(s), publication sources, and publication times: In this part, we obtain the authors' information, the publication period, and the origin of the publication: conference, workshop, or journal.
• Analysis techniques: Based on this study, we identified the various methods, based on feature analysis, that are applied for detecting Deepfakes.
• Empirical evidence: In this part, we focused on the following four components: (i) the datasets used by the study authors, (ii) the features used for analysis, (iii) the models or methodologies applied, and (iv) the measurement metrics used by the study authors to evaluate their results.
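The fields listed above can be captured in one record per primary study. The dataclass below is a hedged illustration of such a data-extraction form; the field names and the example values are ours, not a schema defined in the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ExtractedStudy:
    """One data-extraction record per primary study (illustrative fields)."""
    authors: List[str]
    year: int
    venue: str                  # conference, workshop, journal, or archive
    technique: str              # e.g., "DL", "ML", "Statistical", "Blockchain"
    datasets: List[str] = field(default_factory=list)   # e.g., ["FF++", "Celeb-DF"]
    features: List[str] = field(default_factory=list)   # e.g., ["face landmarks"]
    models: List[str] = field(default_factory=list)     # e.g., ["XceptionNet"]
    metrics: dict = field(default_factory=dict)         # e.g., {"accuracy": 0.98}

example = ExtractedStudy(
    authors=["A. Author"], year=2020, venue="conference", technique="DL",
    datasets=["FF++"], features=["spatio-temporal consistency"],
    models=["CNN"], metrics={"accuracy": 0.95, "auc": 0.97},
)
print(example.technique, example.metrics)
```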

F. DATA SYNTHESIS (DS)
The data synthesis phase specifically reviews the associated and comparative findings from the data extraction process, which can be presented as indications to support definitive responses to the RQs. After accumulating the data, we analyze them for further information extraction and visualize the collected data through various data visualization tools and techniques, such as histograms, pie charts, tables, etc.

III. OUTCOMES
A. DESCRIPTION OF STUDIES
We accumulate a total of 112 studies from our determined sources within the three-year publication period.

1) PUBLICATION PERIOD
Deepfake related research primarily emerged in 2018. Therefore, we considered the publication period from the beginning of 2018 until 2020. As presented in Figure 5, the number of publications increased exponentially over this span. In the figure, we report half-yearly publication counts. As shown, there are only three publications in the first half of 2018, which doubled in the second half. A similar trend continued in 2019. However, this trend broke in 2020, with a surge of 32 publications in only the first six months. The rising trend continued over the year, with almost 1.5 times more publications in the second half of the year than in the first half, indicating the research thirst in the Deepfake sphere.
FIGURE 5. Distribution of studies (half-yearly).

2) SOURCE OF PUBLICATIONS
We mainly consider eight different publication sources from recognized conferences, workshops, journals, and archives. We observe that more articles were published as archived papers in the Deepfake domain, whereas fewer papers were issued in journals. We present the source-wise publication numbers in Table 2. We did not include a source in the table if its publication count was below two.
TABLE 2. Source of publications.

B. RQ-1: WHAT ARE THE POPULAR DEEPFAKE DETECTION TECHNIQUES?
As discussed in Section II-A, we explore the overall survey in the form of some research questions. As part of the discussion, we first determine the Deepfake detection techniques widely used in the literature.


Though Deepfake mainly manipulates images or video using deep learning (DL) based techniques, methods other than DL are also used to detect Deepfakes. We categorize the different research works according to the applied techniques and describe them in the following sections.

1) MACHINE LEARNING BASED METHODS
Traditional machine learning (ML) algorithms are instrumental in comprehending the logic behind any decision that can be expressed in human terms. Such methods are suitable for the Deepfake domain as there is a better grasp of the data and processes. In addition, tuning hyper-parameters and changing model designs are much more manageable. The tree-based ML approaches, for example, Decision Tree, Random Forest, Extremely Randomized Trees, etc., show the decision process in the form of a tree. Therefore, a tree-based method does not have any explainability issues.
GANs are used to automatically train a generative model by treating the unsupervised issue as a supervised one, creating photo-realistic fake faces in images or videos. Some ML-based methods aspire to expose certain irregularities found in such GAN-generated fake videos or images.
A very fundamental approach of Deepfake is to manipulate the human face to confuse its audience. There are different ways to do that. However, to fool the users, most techniques modify certain regions of the face, such as the shade of the eyes, an ear with a ring, etc. Such methods using a single part (a.k.a. feature) are limited to identifying or detecting the manipulated area. To overcome this, the authors in [21] proposed a Deepfake detection technique that combines a set of such features.
In [22], the consistency of the biological signs is measured along the spatial and temporal [23]–[25] directions, using various landmark [26] points of the face (e.g., eyes, nose, mouth, etc.) as unique features for authenticating the legitimacy of GAN-generated videos or images. Similar characteristics are also visible in Deepfake videos, which can be discovered by approximating the 3D head pose [27], since in most cases facial expressions are initially associated with the head's movements. Habeeba et al. [88] applied an MLP to detect Deepfake videos with very little computing power by exploiting visual artifacts in the face region.
As far as performance is concerned, it is observed that machine learning-based Deepfake detection approaches can achieve up to 98% accuracy. However, the performance entirely relies on the type of dataset, the selected features, and the alignment between the train and test sets. A study can obtain a higher result when the experiment uses a similar dataset by splitting it at a certain ratio, for example, 80% for the training set and 20% for the test set. An unrelated dataset drops the performance to close to 50%, which is essentially random guessing.

2) DEEP LEARNING BASED METHODS
In the case of Deepfake detection in images, there are plenty of works where deep learning-based methods are applied to detect specific artifacts generated by the Deepfake generation pipeline. Zhang et al. [33] introduced a GAN simulator that replicates collective GAN-image artifacts and feeds them as input to a classifier to identify them as Deepfake. Zhou et al. [34] proposed a network for extracting the standard features from RGB data, while [35] proposed a similar but more generic solution. Besides, in [36]–[38], researchers proposed new detection frameworks based on physiological measurements, for example, the heartbeat.
The first deep learning-based method for Deepfake video detection was proposed in [40]. Two inception modules, (i) Meso-4 and (ii) MesoInception-4, were used to build the proposed network. In this technique, the mean squared error (MSE) between the actual and expected labels is used as the loss function for training. An enhancement of Meso-4 has been proposed in [41].
In a supervised scenario, the authors in [42] show that deep CNNs [43]–[45] outperform shallow CNNs. Some methods apply techniques for extracting handcrafted features [46], [47], spatiotemporal features [48]–[51], common textures [52], [53], or the 68 face landmarks [54]–[56] together with visual artifacts (i.e., eye, teeth, lip movement, etc.) from the video frames. Such features were used as input to these networks for detecting Deepfake manipulations. Besides, data augmentation [57], super-resolution reconstruction [58], and pixel-level localization strategies [11] are formulated on the entire frame, and a maximum mean discrepancy (MMD) loss [59] is applied to discover more general features.
Further innovations are achieved by introducing an attention mechanism [61], while promising outcomes are shown in [62]–[63] by using an architecture named the capsule network (CN). The CN needs a smaller number of parameters to train than very deep networks. An ensemble learning technique [64]–[65] is applied to increase such structures' performance, achieving more than 99% accuracy.
We observe that many approaches apply frame-by-frame analysis of videos or images to detect the manipulated face and track facial movement to obtain better performance. For example, in [66]–[71], RNN-based networks are proposed to extract features at various micro and macroscopic levels for detecting Deepfake. Regardless of these exciting detection results, it is seen that most of the methods lean towards overfitting. The optical flow based technique [72] and autoencoder-based architectures [73]–[76] are introduced to resolve such problems. A pixel-wise mask [77] is imposed on various models to get the essential depiction of the face's affected area. Fernando et al. [78] applied adversarial training approaches followed by attention-based mechanisms for concealed facial manipulations. In [93], researchers proposed a clustering technique by integrating a margin-based triplet embedding regularization term in their classification loss function; finally, they converted the three-class classification problem into a two-class classification problem. The authors in [94]–[95] proposed a data pre-processing technique for detecting Deepfakes by applying CNN methods.
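Most of the CNN-based detectors surveyed in this subsection share the same basic recipe: crop the face region and fine-tune an ImageNet-style backbone as a binary real/fake classifier on frame images. The PyTorch sketch below shows only that generic recipe; the backbone (ResNet-18), the input size, the loss, and the hyper-parameters are illustrative choices rather than the configuration of any specific paper.

```python
import torch
import torch.nn as nn
from torchvision import models

class FrameClassifier(nn.Module):
    """Generic CNN-based real/fake frame classifier (illustrative backbone)."""
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet18()  # load ImageNet weights in practice
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 2)  # real vs. fake

    def forward(self, x):                  # x: (batch, 3, 224, 224) cropped face frames
        return self.backbone(x)

model = FrameClassifier()
criterion = nn.CrossEntropyLoss()          # some surveyed papers use MSE or BCE variants
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on random tensors standing in for face crops.
frames = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))         # 0 = real, 1 = fake
optimizer.zero_grad()
loss = criterion(model(frames), labels)
loss.backward()
optimizer.step()
print(float(loss))
```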


The researchers in [96] proposed patch and pair convolutional neural networks (PPCNN). In [97], the authors performed an analysis in the frequency domain by exploiting the richness of the images' latent patterns. A modern approach called ID-revelation [98] was proposed to learn temporal facial features based on a person's movement while talking. A novel feature extraction method [99] was proposed for effectively classifying Deepfake images. In [100], a multimodal approach was proposed for detecting real and Deepfake videos; this method extracts and analyzes the similarities between the audio and visual modalities within the same video. In [101], a Deepfake detection method is applied to find the discrepancies between faces and their context by combining multiple XceptionNet models. In [101], a separable convolutional network is used for detecting such manipulations. [103] resorts to a triplet loss function in the feature extraction process to better classify fake faces. A patch-based classifier was introduced in [104] to focus on local patches rather than the global structure. In [105]–[106], the authors extracted features using improved VGG networks. A hypothesis test was performed in [107].

3) STATISTICAL MEASUREMENTS BASED METHODS
Determining different statistical measures, such as average normalized cross-correlation scores between original and suspected data, helps to understand the originality of the data. Koopman et al. [108] examined the photo response non-uniformity (PRNU) for detecting Deepfakes in video frames. PRNU is a unique noise pattern in digital images that occurs due to defects in the camera's light-sensitive sensors. Because of its distinctiveness, it is also considered the fingerprint of digital photos. The research generates a sequence of frames from input videos and stores them in chronologically categorized directories. Each video frame is cropped to the same pixel range to preserve and clarify the portion of the PRNU sequence. These frames are then divided into eight equal groups. The method then creates the standard PRNU pattern for each frame using the second-order FSTV method [147]. After that, it correlates them by measuring the normalized cross-correlation scores and calculating the differences between the correlation scores and the mean correlation score for each frame. To evaluate the statistical significance between Deepfakes and original videos, the authors conduct a t-test [109] on the results.
To model a basic generative convolutional structure, the authors in [110] extracted a collection of regional features using the Expectation-Maximization (EM) algorithm. After the extraction, they applied ad-hoc validation to architectures such as GDWCT, STARGAN, ATTGAN, STYLEGAN, and STYLEGAN2, using naive classifiers in preliminary experiments. Agarwal et al. [111] performed a hypothesis test by proposing a statistical framework [112] for detecting Deepfakes. First, this method defines the shortest path between the distributions of original and GAN-created images. Based on the results of this hypothesis, this distance measures the detection capability; for example, Deepfakes can easily be detected when this distance increases. Usually, the distance increases if and only if the GAN provides a lesser amount of correctness. Besides, an extremely precise GAN is required to create high-resolution manipulated images that are harder to detect.

4) BLOCKCHAIN BASED METHODS
Blockchain technology provides various features that can verify the legitimacy and provenance of digital content in a highly trusted, secured, and decentralized manner. In a public Blockchain, anyone has direct access to every transaction, log, and tamper-proof record. For Deepfake detection, the public Blockchain is considered one of the most appropriate technological solutions for verifying a video's or image's genuineness in a decentralized way. Users usually need to explore the origin of videos or images when they are marked as suspected.
Hasan and Salah [113] proposed a Blockchain-based generic framework to track suspected videos' origins back to their sources. The proposed solution can trace its transaction records even when the material is copied several times. The basic principle is that digital content is considered authentic when it is convincingly traced to a reliable source. For Deepfakes, the public Blockchain verifies video content's legitimacy in a decentralized way, as the technology can provide some critical features to prove its authenticity. The following are the main contributions of [113]:
• It presents a generic framework based on Blockchain technology for establishing proof of a digital content's authenticity back to its trusted source.
• It presents the proposed solution's architecture and design details to control and administrate the interactions and transactions among participants.
• It integrates the critical features of IPFS [114]-based decentralized storage with the Blockchain-based Ethereum Name Service.
Chan et al. [115] proposed a decentralized approach based on Blockchain to trace and track digital content's historical provenance (i.e., images, videos, etc.). In this proposed approach, multiple LSTM networks are used as a deep encoder for creating discriminating features, which are then compressed and used to hash the transaction. The main contributions of this paper are as follows:
• Using multiple LSTM CNN models, image/video contents are hashed and encoded.
• High-dimensional features are preserved as a binary coded structure.
• The information is stored in a permission-based Blockchain, which gives the owner control over its contents.
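The Blockchain-based systems above rest on a simple primitive: derive a content fingerprint (a hash of the frames or of learned features) and record it, together with the source identity, in a tamper-evident ledger. The Python sketch below illustrates only that fingerprinting-and-chaining idea with standard hashing; it is not the smart-contract or IPFS machinery of [113], nor the LSTM encoder of [115], and all names and values are assumptions for illustration.

```python
import hashlib
import json
import time

def video_fingerprint(frames):
    """Hash a sequence of frame byte-strings into one content fingerprint."""
    h = hashlib.sha256()
    for frame in frames:
        h.update(frame)
    return h.hexdigest()

def append_block(chain, fingerprint, source):
    """Append a provenance record whose hash also covers the previous block."""
    prev_hash = chain[-1]["block_hash"] if chain else "0" * 64
    record = {"fingerprint": fingerprint, "source": source,
              "timestamp": time.time(), "prev_hash": prev_hash}
    record["block_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    chain.append(record)
    return record

chain = []
frames = [b"frame-bytes-0", b"frame-bytes-1"]   # placeholders for decoded frames
append_block(chain, video_fingerprint(frames), source="publisher-A")
# Any later copy can be re-hashed and compared against the recorded fingerprint.
print(chain[-1]["fingerprint"] == video_fingerprint(frames))
```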


TABLE 3. Classification of Deepfake detection methods.
Based on these studies, taking all these methods together, Table 3 lists the categories of Deepfake detection strategies and displays the quantity (No.) and percentage (PCT) of studies in the related categories. This table includes 91 studies, excluding the 21 different reviews ([60], [116]–[135]) which merge various methods. The table also reveals that the deep learning-based approach is the most widely used technique, accounting for around 77% of all studies. The research relating to machine learning approaches and statistical methods accounts for 18% and 3%, respectively, and the Blockchain-based approach for 2%. Overall, we divide the Deepfake detection techniques into four categories: deep learning-based methods, machine learning-based techniques, statistical-based techniques, and Blockchain-based techniques. Among them, deep learning-based methods are the most broadly used for detecting such Deepfakes.

C. RQ-2: WHAT IS THE WAY TO PERFORM EMPIRICAL TESTS TO DETECT DEEPFAKE USING THESE STUDIES?
To provide an answer to RQ-2, we review the different experimental methods in depth and categorize the overall Deepfake detection process into six distinct stages (see Figure 6) that are summarized below.
FIGURE 6. Steps of Deepfake detection.
• Data collection: Collecting and organizing unadulterated original and Deepfaked data (images or videos) is done in this initial phase.
• Face detection: Identifying which parts of an image or video need to be focused on to reveal characteristics like age, gender, emotions, etc., using facial expressions falls under this stage.
• Feature extraction: Extracting various features from the face area as candidate features for the detector.
• Feature selection: Selecting, from the extracted features, those that are most useful for Deepfake detection.
• Model selection: Finding a suitable model from a pool of available models for classification. These models include deep learning-based models, machine learning-based models, and statistical models.
• Model evaluation: Finally, evaluating the performance of the selected models using various measurement metrics.
The following sub-sections describe the datasets used in the experiments, the frequently utilized features, the models used for detection tasks, and the measurement metrics used to evaluate models' performance in detecting Deepfakes (a minimal code sketch of the data collection and face detection stages is given after the dataset discussion below).

1) SRQ-2.1: WHAT DATASETS ARE TYPICALLY USED IN DEEPFAKE DETECTION EXPERIMENTS?
We found various Deepfake datasets used in numerous studies for training and testing purposes. In turn, these datasets have enabled incredible advances in Deepfake detection. Most of the real videos in these datasets are filmed with a few volunteer actors in limited scenes. The fake videos are crafted by researchers using a few popular Deepfake software tools.
Figure 7 displays the various datasets used in the different studies. From this figure, it is observed that FaceForensics++, Celeb-DF, and DFDC are quite popular and were used in plenty of studies. Table 4 describes a summary of these datasets.
FIGURE 7. List of datasets used in Deepfake related studies.
TABLE 4. The list of Deepfake datasets.
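As referenced above, the data collection and face detection stages are typically implemented by sampling frames from the dataset videos and cropping the detected face before any features are computed. The OpenCV sketch below is a minimal, hedged illustration of those two stages only; the path, sampling rate, and the Haar-cascade detector are illustrative choices rather than the tooling used in any particular surveyed study.

```python
import cv2

# Illustrative path; real experiments would iterate over a dataset such as FF++.
VIDEO_PATH = "example_video.mp4"
CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def sample_face_crops(video_path, every_n_frames=30, size=(224, 224)):
    """Yield resized face crops from every n-th frame of a video."""
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % every_n_frames == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = CASCADE.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            for (x, y, w, h) in faces:
                yield cv2.resize(frame[y:y + h, x:x + w], size)
        index += 1
    cap.release()

crops = list(sample_face_crops(VIDEO_PATH))
print(f"extracted {len(crops)} face crops")
```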


TABLE 5. Distribution of used models.
Based on the categories of Deepfake detection and analysis techniques described in RQ-1, 21 studies use special artifact-based features generated by various editing processes. Among them, 20 studies use texture and spatio-temporal consistency features, and 14 studies involve facial landmark-based features. Also, 13 research papers perform experiments using visual artifact-based elements, for example, eye blinking, head pose, lip movement, etc. Eight pieces of work apply biological characteristics, whereas seven studies concern intra-frame inconsistencies with frequency domain analysis. In addition, six studies use GAN-based features, and four studies cover latent space-based features. Ten studies use custom features obtained through various analyses, including error level analysis, mesoscopic analysis, steganalysis, super-resolution, augmentation, maximum mean discrepancy, PRNU pattern analysis, etc. The details are described in the results for RQ-1. The study shows that special artifact-based features, face landmarks, and spatio-temporal features are the most widely used for detecting Deepfakes.

3) SRQ-2.3: WHAT MODELS ARE USED TO DETECT DEEPFAKE MANIPULATION?
This segment describes the various models that are used for detecting Deepfake. Based on this study, we divide these models into three groups: (i) deep learning models, (ii) machine learning models, and (iii) statistical models.
• Deep learning models: In computer vision, deep learning models have been used widely due to their feature extraction and selection mechanism, as they can directly extract or learn features from the data. In Deepfake detection studies, we found that the following deep learning-based models have been used: convolutional neural network (CNN) models (e.g., XceptionNet, GoogleNet, VGG, ResNet, EfficientNet, HRNet, InceptionResNetV2, MobileNet, InceptionV3, DenseNet, SuppressNet, StatsNet), recurrent neural network (RNN) models (e.g., LSTM, FaceNet), the Bidirectional RNN model, the Long-term Recurrent Convolutional Neural Network (RCNN) model, the Faster RCNN model, the Hierarchical Memory Network (HMN) model, the Multi-task Cascaded CNN (MTCNN) model, and Deep Ensemble Learning (DEL).
• Machine learning models: This technique creates a feature vector by defining the right features using various state-of-the-art feature selection algorithms. It then feeds this vector as input to train a classifier to decide whether the videos or images are manipulated by Deepfake or not (a minimal sketch of this pipeline is given at the end of this subsection). Support Vector Machine (SVM), Logistic Regression (LR), Multilayer Perceptron (MLP), Adaptive Boosting (AdaBoost), eXtreme Gradient Boosting (XGBoost), K-Means clustering (k-MN), Random Forest (RF), Decision Tree (DT), Discriminant Analysis (DA), Naive Bayes (NB), and Multiple Instance Learning (MIL) are used as machine learning-based models.
• Statistical models: The statistical models are based on the use of information-theoretic measures for validation. In these models, the shortest paths are calculated between the distributions of original and Deepfake videos/images. For example, in [108], the significance of the mean normalized cross-correlation scores between the original and the Deepfake videos is measured to classify them as fake or real. The often-applied statistical models are Expectation-Maximization (EM), Total Variation (TV) distance, Kullback-Leibler (KL) divergence, Jensen-Shannon (JS) divergence, etc.
Based on these studies, we categorize the models into deep learning models, machine learning models, and statistical methods, as shown in Table 5. The table outlines the number and the percentage of models used in the studies, excluding the 21 different reviews. Also, we observe that the DL-based studies hold the highest proportion in this SLR.
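The machine learning category described above boils down to a two-step pipeline: build a fixed-length feature vector per image or frame (e.g., from facial landmarks or artifact statistics), then train a conventional classifier such as an SVM. The scikit-learn sketch below illustrates that pipeline on synthetic feature vectors; the feature dimensionality and the data are placeholders, not results from the surveyed studies.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Placeholder feature vectors (e.g., landmark coordinates, texture statistics).
X = rng.normal(size=(400, 136))            # 68 landmarks -> 136 (x, y) values
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic real (0) / fake (1) labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)   # the common 80/20 split noted earlier

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```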


TABLE 6. Summary of works towards Deepfake detection.



Figure 8 displays the full list of detector groups found in these primary studies, where CNN has the most subdivisions. Based on Table 5, we further apply a subcategorization to the CNN models and find that XceptionNet and ResNet each account for 17%, and VGG for 12%. Besides, LSTM models take 13% of the RNN category. In addition, the most popular machine learning model is SVM, with 12%, followed by k-MN with 4%. The detailed distribution over the various models is presented in Figure 9, which shows the proportion of the models (e.g., DL, ML, statistical) used in the various studies for detecting Deepfake; it thereby provides the answer to SRQ-2.3. The reviewed papers show that deep neural network (DNN) models are successful in Deepfake detection, and CNN-based models demonstrate the most efficiency among all the DNN models.
FIGURE 8. The list of Deepfake detection models.
FIGURE 9. The allocation of subcategories of detection models. ML: Machine Learning; DL: Deep Learning.
Finally, we summarize everything at a glance using Table 6, which specifies the features, methods, models, and datasets used throughout the studies, notes the specific manipulation detection techniques, and gives a reference to each of the primary studies. The abbreviations used in Table 6 are as follows:
• Focus indicates the clue for the detection (DMF: Digital Media Forensics, FM: Face Manipulation, Both: DMF and FM).
• Methods indicates the method category (ML: Machine Learning, DL: Deep Learning, STAT: Statistical method, BC: Blockchain).
• Models represents the types of model: DL (CNN: Convolutional Neural Network, RNN: Recurrent Neural Network, RCNN: Regional Convolutional Neural Network, MTCNN: Multi-task Cascaded CNN, MSCNN: Multi-scale Temporal CNN), ML (SVM: Support Vector Machine, RF: Random Forest, MLP: Multilayer Perceptron Neural Network, LR: Logistic Regression, k-MN: K-means clustering, XGB: XGBoost, ADB: AdaBoost, DT: Decision Tree, NB: Naive Bayes, KNN: K-Nearest Neighbour, DA: Discriminant Analysis), STAT (EM: Expectation Maximization, CRA: Correlation Analysis), BC (ETH: Ethereum Blockchain).
• Features (SA: Special Artifacts, VA: Visual Artifacts, BA: Biological Artifacts, FL: Face Landmarks, STC: Spatio-temporal Consistency, TEX: Texture, FDA: Frequency Domain Analysis, LS: Latent Feature, GAN: Generative Adversarial Network based feature, MES: Mesoscopic features, IFIC: Intra-frame Inconsistency, CPRNU: Contrastive and Photoresponsive PRNU pattern, IMG: Image Metadata, Augmentation & Steganalysis, Other: a feature not in the common list).
• Datasets (FF: FaceForensics, FF++: FaceForensics++, DFD: Deepfake Detection, CELEB-A: DeepFake Forensics V1, CELEB-DF: DeepFake Forensics V2, DFDC: Deepfake Detection Challenge, DF-TIMIT: Deepfake-TIMIT, DF-1.0: DeeperForensics-1.0, WDF: Wild Deepfake, SMFW: SwapMe and FaceSwap, DFS: Deep Fakes, FFD: Fake Faces in the Wild, FE: FakeET, FS: Face Shifter, DF: Deepfake, SFD: Swapped Face Detection, UADFV: Inconsistent Head Poses, MANFA: Tampered Face, Other: authors' custom datasets).


4) SRQ-2.4: WHAT MEASUREMENT METRICS ARE USED FOR COMPUTING THE PERFORMANCE OF DEEPFAKE DETECTION METHODS?
This section briefly describes the various measurement metrics applied for assessing the models' performance in detecting such Deepfakes. A confusion matrix holds information about actual and predicted classification results. The detection capabilities of the used methods are measured and confirmed using this matrix data. Table 7 describes the confusion matrix.
TABLE 7. Confusion matrix.
Using Table 7, we can define the terms: TP is the number of Deepfakes that are correctly predicted as Deepfake, and TN is the number of real images/videos correctly predicted as real. Besides, FP stands for the number of real images/videos incorrectly predicted as Deepfake, whereas FN is the number of Deepfakes incorrectly predicted as real. Similarly, using Table 8, we can define the various measurement metrics and show how many studies use each of them.
Based on Table 8, it is seen that the most often applied measurement metrics are accuracy (AC), the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC). Recall, error rate (ER), precision (P), F1-score, and log loss occupy a similar proportion. The least used performance measure is the Frechet Inception Distance (FID). Based on the study, accuracy and AUC are the most widely used measurement metrics in detecting Deepfake.
TABLE 8. Measurement metrics used by various studies.

D. RQ-3: WHAT IS THE CLASSIFICATION FRAMEWORK FOR DEEPFAKE DETECTION APPROACHES?
For better insights, we summarize our key findings in Figure 10. As demonstrated in the figure, we classify the overall approaches with respect to different elements such as input data, features, method categories, and type of techniques. A path between two elements denotes the related components used in the companion paper for any method. As presented in the figure, most papers apply image or video as the input data, whereas many papers use both image and video as input. Special Artifacts, Texture, and Spatio-temporal Consistency are the most commonly used features. About 75% of the methods used DL-based techniques as the detection method category. Only a few papers used Blockchain and Statistical approaches for detecting such Deepfakes.
In detecting Deepfake, various underlying techniques are available, such as biological signals, phoneme-viseme mismatches, facial expressions and movements (i.e., 2D and 3D facial landmark positions, head pose, and facial action units), etc. We combine them under two central umbrellas: Facial Manipulation and Digital Media Forensics. As shown in Figure 10, most of the DL-based methods exploit Facial Manipulation for Deepfake detection. However, Machine Learning based methods utilize both techniques almost equally. Common to both the Blockchain and Statistical approaches, they apply only Digital Media Forensics as part of the detection technique.
FIGURE 10. Taxonomy of Deepfake detection techniques. This taxonomy classifies the detection algorithms according to the media (image, video, or image and video), the features used (among the 12 features), the detection method (DL, ML, Blockchain, or statistical), and the clue for the detection (facial manipulation, digital media forensics, or other indications). The size of the connection line reflects the relative count of papers.

E. RQ-4: WHAT IS THE GENERAL EFFICIENCY OF A VARIETY OF DEEPFAKE DETECTION STRATEGIES BASED ON EXPERIMENTAL PROOF?
This segment attempts to determine the efficacy of Deepfake detection methods. The output assessment values are first obtained from the studies and stored in a spreadsheet. After that, we count the number of studies that use the same method and the same measurement metrics (precision, accuracy, and recall). Finally, we apply four statistics to these values: the minimum, maximum, mean, and standard deviation (see Table 9).
In Table 9, based on the mean values of accuracy and AUC, deep learning-based methods outperform the other methods, achieving 89.73% and 0.917, respectively. Besides, we also compare the recall and precision values for both techniques. Based on the overall results, we found deep learning-based techniques are efficient for detecting Deepfake.
TABLE 9. Performance of various detection methods.

F. RQ-5: IS THE EFFICIENCY OF DEEP LEARNING MODELS BETTER THAN NON-DEEP LEARNING MODELS IN DEEPFAKE DETECTION BASED ON EXPERIMENTAL RESULTS?
We split the models into two groups: (i) deep learning-based models and (ii) non-deep learning-based models. We determine the mean accuracy, AUC, recall, and precision. Next, we apply a comparative analysis of these two groups' performance and obtain an average result.
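The metric definitions of SRQ-2.4 and the aggregation used for RQ-4 and RQ-5 can be made concrete in a few lines of code. The sketch below computes accuracy, precision, and recall from confusion-matrix counts and then the minimum, maximum, mean, and standard deviation of accuracies reported by groups of studies; the numeric values are invented placeholders, not values from Table 9.

```python
import statistics

def metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}

print(metrics(tp=90, tn=85, fp=15, fn=10))

# RQ-4/RQ-5 style aggregation over accuracies reported per method category
# (placeholder values, one entry per study).
reported = {
    "deep learning": [0.97, 0.91, 0.88, 0.93],
    "non-deep learning": [0.84, 0.79, 0.90],
}
for category, values in reported.items():
    print(category, {
        "min": min(values), "max": max(values),
        "mean": round(statistics.mean(values), 3),
        "std": round(statistics.stdev(values), 3),
    })
```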


Based on the evaluation of these models using the performance measures (accuracy, AUC, recall, and precision), we observe that, in general, deep learning-based models outperformed non-deep learning-based models. As the results reported in Figure 11 show, the accuracy and precision of the deep learning models are significantly better than those of the non-deep learning models. However, in the case of AUC and recall, the performance is fairly similar. The overall results demonstrate the superiority of deep learning-based models over non-deep learning-based models.
FIGURE 11. The comparison of the results among deep learning and non-deep learning based models.

IV. OBSERVATIONS
A. COMBINING DIFFERENT DEEP LEARNING METHODS IS CRITICAL FOR ACCURATE DEEPFAKE DETECTION
Based on the review, we see that multiple strategies are applied using numerous features. In general, early methods used handcrafted features collected from face artifacts. Recent research applied deep learning-based approaches, especially CNN models, to mechanically or directly learn perceptible and selective features to identify such Deepfakes. For example, Ding et al. [82] introduced a two-phase CNN method for Deepfake detection. The first stage extracts distinguishing features from counterfeit and actual images by incorporating various dense units, each of which includes a list of dense blocks. The second phase uses these features to train the proposed CNN to classify the input images as fake or real.
Due to the typical use of lossy compression in video encoding, most detection techniques used for images are not suitable for videos, as the compression degrades the frame data. Because videos have temporal features and varying frame sizes, it is challenging for image-only techniques to distinguish counterfeit frames. In [68], a recurrent convolutional network (RCN) was proposed to use the spatiotemporal features [48]–[51] of videos for detecting Deepfakes.
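The recurrent-convolutional designs discussed here (e.g., [66], [68]) share one skeleton: a CNN encodes each frame, and a recurrent layer aggregates the per-frame embeddings over time before a final real/fake decision. The PyTorch sketch below shows only that skeleton; the layer sizes and the choice of backbone are illustrative, not the published architectures.

```python
import torch
import torch.nn as nn
from torchvision import models

class CnnLstmDetector(nn.Module):
    """Frame-level CNN features aggregated by an LSTM (illustrative sizes)."""
    def __init__(self, hidden=256):
        super().__init__()
        cnn = models.resnet18()                      # pretrained weights in practice
        cnn.fc = nn.Identity()                       # keep the 512-d frame embedding
        self.cnn = cnn
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)             # real vs. fake

    def forward(self, clips):                        # clips: (batch, frames, 3, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1))        # (batch*frames, 512)
        feats = feats.view(b, t, -1)
        _, (h_n, _) = self.lstm(feats)
        return self.head(h_n[-1])                    # one logit pair per clip

logits = CnnLstmDetector()(torch.randn(2, 8, 3, 224, 224))
print(logits.shape)   # torch.Size([2, 2])
```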


Likewise, Guera and Delp [66] discovered intra-frame and temporal inconsistencies among the frames of Deepfake videos. They proposed a network composed of a CNN and an LSTM to detect such discrepancies in Deepfakes. In this architecture, the CNN extracts the frame-level features, and the LSTM uses these features as input to generate a descriptor responsible for analyzing the temporal sequence. Besides, to use physiological indications [35]–[36], for example, eye blinking, as features for detecting Deepfake, Li et al. [46] proposed a long-term recurrent convolutional network (LRCN). Their method highlighted that the total eye blinking of an individual in Deepfake videos is always lower than in real videos; it can easily be extracted from the eye areas based on six eye landmarks and used as a feature.
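The eye-blinking cue relies on the six landmarks that outline each eye: when the eye closes, the vertical landmark distances shrink relative to the horizontal one. The widely used eye aspect ratio (EAR) captures this, and the NumPy sketch below computes it per frame so that the blink rate can be compared between real and Deepfake videos. The landmark input is assumed to come from a standard 68-point face landmark detector, and the 0.2 threshold is an illustrative value, not one taken from the surveyed paper.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) array of landmark coordinates around one eye.
    EAR = (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|); it drops sharply during a blink."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)

def blink_count(ear_sequence, threshold=0.2):
    """Count transitions from open to closed across a per-frame EAR sequence."""
    closed = np.asarray(ear_sequence) < threshold
    return int(np.sum(closed[1:] & ~closed[:-1]))

# Illustrative per-frame EAR values for one eye (a blink around frame 3).
ears = [0.31, 0.30, 0.12, 0.10, 0.29, 0.32]
print(blink_count(ears))   # -> 1
```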


On the other hand, Rana and Sung [65] proposed a deep ensemble learning strategy, namely DeepfakeStack, to detect Deepfake by analyzing multiple deep learning models. The concept behind DeepfakeStack is to train a meta-learner on top of base-learners with pre-trained experience. It provides an interface for fitting the meta-learner on the base-learners' predictions and demonstrates how an ensemble method performs the classification role. The DeepfakeStack architecture includes multiple base-learners (the level-0 models) and a meta-learner (a level-1 model). The experiments reveal that DeepfakeStack achieves 99.65% accuracy and an AUROC of 1.0.
To some extent, these deep learning methods are complementary. In practice, combining multiple deep learning methods can obtain improved results compared to a single method. For example, DeepfakeStack [65] integrates multiple state-of-the-art deep learning classification algorithms and produces a sophisticated composite classifier that achieves 99.65% accuracy. Based on RQ-1, it is seen that the maximum number of studies have applied deep learning techniques for detecting Deepfake. Therefore, it may be appropriate to explore the compatibility of deep learning methods and integrate some of them for further progress in Deepfake detection.
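Stacked ensembles of the kind described above can be prototyped directly with scikit-learn: several base learners (level-0) are fitted, and a meta-learner (level-1) is trained on their predictions. The sketch below uses simple classifiers on synthetic features purely to illustrate the structure; DeepfakeStack itself stacks deep CNN base-learners, which is not reproduced here, and all data and hyper-parameters are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 32))                   # placeholder frame-level features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # synthetic real/fake labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=1)),
                ("svm", SVC(probability=True, random_state=1))],
    final_estimator=LogisticRegression(),        # the level-1 meta-learner
    cv=5,
)
stack.fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1]))
```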


B. DEEP LEARNING-BASED METHODS ARE RECOMMENDED IN DEEPFAKE DETECTION
Compared with the traditional machine learning approaches, we note that applying deep learning algorithms to detect Deepfake has become a hot subject (as seen in SRQ-2.3). We also find that most studies follow a traditional CNN approach to classify Deepfake in the deep-learning environment. Still, researchers have not yet figured out how to determine Deepfake authorship.
Based on the outcomes of RQ-4, it is observed that the deep learning-based models achieve better performance than the non-deep learning models in Deepfake detection. Therefore, deep learning-based approaches are advised when detecting Deepfake.

C. A UNIQUE FRAMEWORK IS REQUIRED FOR THE FAIR EVALUATION OF DIFFERENT HETEROGENEOUS DEEPFAKE DETECTION METHODS
After reviewing the studies listed above, we note that several studies have used different datasets. Secondly, there are also differences among specific experiments that use the same dataset. (1) The measurement metrics used in the studies in question are not standardized; for example, some experiments evaluate the performance of detection tasks using accuracy and AUROC, while some studies use precision and recall only. (2) In these studies, it is also seen that the dataset size is not consistent; for example, the FF++ dataset has 1000 Deepfake videos, but a few studies use the entire dataset while others use half, and some studies use only 100 videos. (3) The original videos used in these experiments are hardly available in public. The above conditions may lessen the trustworthiness of the RQ-3 and RQ-4 findings.
Based on the above circumstances, this section concludes that creating a unique framework for the fair assessment of performance is essential.

V. LIMITATIONS AND CHALLENGES
This section discusses some limitations and challenges that we observed during the preparation of this SLR.

A. CONSTRUCT VALIDITY
Construct validity is related to the collection of studies. We compiled the associated articles from journals, seminars, conferences, workshops, and archives of many electronic libraries in this SLR. It is still possible that some related papers might be missing from our collection of studies. Further, we might have made a few mistakes in sorting these studies through the selection or rejection parameters used in the process. We evaluated our catalog of studies using a double-checking approach to address such errors.

B. INTERNAL VALIDITY
Internal validity is related to data extraction and analysis. The present work involved an intense workload of data extraction and data processing. A cross-checking process was applied to the collected data, and we retrieved the final data after we agreed on the comparative results. Nevertheless, errors may still be present in how we collected and processed the data. We believe the original authors could cross-check the reported results to avoid any unexpected errors.

C. EXTERNAL VALIDITY
External validity concerns the summary of the results obtained from the various studies that we considered. To improve the quality of the findings for RQ-3 and RQ-4 in future studies, we recommend setting up a unique framework to reduce the inconsistencies in the reported results. Besides, more Deepfake detection experiments might be required to produce definitive and systematic outcomes.

VI. CONCLUSION
This SLR presents various state-of-the-art methods for detecting Deepfake published in 112 studies from the beginning of 2018 to the end of 2020. We present the basic techniques and discuss the different detection models' efficacy in this work. We summarize the overall study as follows:
• The deep learning-based methods are widely used in detecting Deepfake.
• In the experiments, the FF++ dataset occupies the largest proportion.
• The deep learning (mainly CNN) models hold a significant percentage of all the models.
• The most widely used performance metric is detection accuracy.
• The experimental results demonstrate that deep learning techniques are effective in detecting Deepfake. Further, it can be stated that, in general, the deep learning models outperform the non-deep learning models.
With the rapid progress in the underlying multimedia technology and the proliferation of tools and applications, Deepfake detection still faces many challenges. We hope this SLR provides a valuable resource for the research community in developing effective detection methods and countermeasures.

REFERENCES
[1] FaceApp. Accessed: Jan. 4, 2021. [Online]. Available: https://www.faceapp.com/
[2] FakeApp. Accessed: Jan. 4, 2021. [Online]. Available: https://www.fakeapp.org/
[3] G. Oberoi. Exploring DeepFakes. Accessed: Jan. 4, 2021. [Online]. Available: https://goberoi.com/exploring-deepfakes-20c9947c22d9
[4] J. Hui. How Deep Learning Fakes Videos (Deepfake) and How to Detect it. Accessed: Jan. 4, 2021. [Online]. Available: https://medium.com/how-deep-learning-fakes-videos-deepfakes-and-how-to-detect-it-c0b50fbf7cb9
[5] I. Goodfellow, J. P. Abadie, M. Mirza, B. Xu, D. W. Farley, S. Ozair, A. Courville, and Y. Bengio, ''Generative adversarial nets,'' in Proc. 27th Int. Conf. Neural Inf. Process. Syst. (NIPS), vol. 2. Cambridge, MA, USA: MIT Press, 2014, pp. 2672–2680.
[6] G. Patrini, F. Cavalli, and H. Ajder, ''The state of deepfakes: Reality under attack,'' Deeptrace B.V., Amsterdam, The Netherlands, Annu. Rep. v.2.3., 2018. [Online]. Available: https://s3.eu-west-2.amazonaws.com/rep2018/2018-the-state-of-deepfakes.pdf


MD SHOHEL RANA (Member, IEEE) received the bachelor's and master's degrees in computer science and engineering from Mawlana Bhashani Science and Technology University, Bangladesh, and the Ph.D. degree in computational science from The University of Southern Mississippi, MS, USA, in 2021. He is currently a Visiting Assistant Professor of computer science at Northern Kentucky University, KY, USA. His research interests include digital image processing and computer vision, data mining and pattern recognition, machine learning, deep learning, cybersecurity, e-learning, distributed database, and blockchain.

MOHAMMAD NUR NOBI (Member, IEEE) received the bachelor's and master's degrees in computer science and engineering from Mawlana Bhashani Science and Technology University, Bangladesh. He is currently pursuing the Ph.D. degree with the Department of Computer Science, The University of Texas at San Antonio (UTSA). His research interests include cybersecurity, machine learning, computer vision, and medical image processing.

BEDDHU MURALI received the Ph.D. degree in aerospace engineering from Mississippi State University, in 1992. He is currently an Associate Professor of computing sciences and computer engineering (CSCE) at The University of Southern Mississippi, USA. His research interests include scientific computational algorithms, high-performance computing, image and video processing, robotics, and machine learning.

ANDREW H. SUNG (Member, IEEE) received the Ph.D. degree in computer science from the State University of New York at Stony Brook, USA, in 1984. He is currently a Professor of computing sciences and computer engineering (CSCE) at The University of Southern Mississippi, USA. His research interests include computational intelligence and its applications, information security and multimedia forensics, social network analysis, data mining and pattern recognition, and petroleum reservoir modeling.
