Towards Adversarial Malware Detection: Lessons Learned
from PDF-based Attacks
DAVIDE MAIORCA, University of Cagliari, Italy
BATTISTA BIGGIO, University of Cagliari, Italy
GIORGIO GIACINTO, University of Cagliari, Italy
Malware still constitutes a major threat in the cybersecurity landscape, also due to the widespread use of
infection vectors such as documents. These infection vectors hide embedded malicious code from victim
users, facilitating the use of social engineering techniques to infect their machines. Research has shown that
machine-learning algorithms provide effective detection mechanisms against such threats, but the existence of
an arms race in adversarial settings has recently challenged such systems. In this work, we focus on malware
embedded in PDF files as a representative case of such an arms race. We start by providing a comprehensive
taxonomy of the different approaches used to generate PDF malware and of the corresponding learning-based
detection systems. We then categorize threats specifically targeted against learning-based PDF malware
detectors, using a well-established framework in the field of adversarial machine learning. This framework
allows us to categorize known vulnerabilities of learning-based PDF malware detectors and to identify novel
attacks that may threaten such systems, along with the potential defense mechanisms that can mitigate the
impact of such threats. We conclude the paper by discussing how these findings highlight promising research
directions towards tackling the more general challenge of designing robust malware detectors in adversarial
settings.
CCS Concepts: · Security and privacy → Malware and its mitigation; · Theory of computation →
Adversarial learning;
Additional Key Words and Phrases: PDF Files, Infection Vectors, Machine Learning, Evasion Attacks, Vulnerabilities, JavaScript
ACM Reference Format:
Davide Maiorca, Battista Biggio, and Giorgio Giacinto. 2019. Towards Adversarial Malware Detection: Lessons
Learned from PDF-based Attacks. 1, 1 (April 2019), 35 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
1 INTRODUCTION
Malware for x86 (and, more recently, for mobile architectures) is still considered one of the top
threats in computer security. While it is common to think that the most dangerous attacks are
crafted using plain executable files (especially on Windows-based operating systems), security
reports showed that the most dangerous attacks in the wild were carried out by using infection
vectors [87]. With this term, we refer to non-executable files whose aim is to exploit vulnerabilities
of third-party applications to trigger the download (or direct execution) of executable payloads. Using
such vectors gives attackers multiple advantages. First, they can exploit the structure of third-party
formats to conceal malicious code, making their detection significantly harder. Second, infection
vectors can be effectively used in social engineering campaigns, as victims are more prone to
receive and open documents or multimedia content. Finally, although vulnerabilities of third-party
applications are often publicly disclosed, they are not promptly patched. The absence of proper
security updates thus makes the lifespan of attacks perpetrated through infection vectors much longer.
Machine learning-based technologies have been increasingly used in both academic and industrial
environments (see e.g., [48]) to detect malware embedded in infection vectors like malicious PDF
files. Research work has demonstrated that learning-based systems can be effective at detecting
obfuscated attacks that typically evade simple heuristics [23, 65, 82, 95], but the problem
is still far from being solved. Despite the significant increase in detected attacks, researchers
started questioning the reliability of learning algorithms against adversarial attacks carefully crafted
against them [8–10, 17, 18]. Such attacks became widely popular when researchers showed that it
was possible to evade deep learning algorithms for computer vision with adversarial examples, i.e.,
minimally-perturbed images that mislead classification [40, 88]. The same attack principles have
also been employed to craft adversarial malware samples, as first shown in [9], and subsequently
explored in [29, 42, 52, 96, 99]. Such attacks typically perform a few fine-grained changes on
correctly detected, malicious samples to have them misclassified as legitimate. Accordingly, it
becomes possible to evade machine-learning detection in a stealthier manner, without resorting to
invasive changes like those performed through code obfuscation.
Malicious PDF files constitute the most studied infection vectors in adversarial environments [9,
20, 61–64, 82, 83, 96, 102]. This file type was chosen for three main reasons. First, it represented
the most dangerous infection vector in the wild from 2010 to 2014 (to be subsequently replaced by
Office-based malware), and many machine learning-based systems were developed to detect the
vast variety of polymorphic attacks related to this format (e.g., [65, 82, 83]). Second, the complexity
of the PDF file format allows attackers to employ various solutions to conceal code injection or
other attack strategies. Finally, it is easy to modify its structure, for example by injecting benign or
malicious material in various portions of the file. This characteristic makes PDF files particularly
prone to be used in adversarial environments, as the effort that attackers must carry out to create
attack samples is significantly low.
While previous surveys in the field of PDF malware analysis focused on describing the properties
of detection systems [32, 72], our work explores the topic from the perspective of adversarial
machine learning. The idea is to show how adversarial attacks have been carried out against PDF
malware detectors by exploiting the vulnerabilities of their essential components, and to highlight
how the arms race between attackers and defenders has evolved in this scenario over the last
decade. The result is twofold: on the one hand, we highlight the current security issues that
allow attackers to deceive state-of-the-art algorithms; on the other hand, understanding
adversarial attacks points out new, intriguing research directions which we believe can also be
relevant for other malware detection problems.
To adequately describe PDF malware detection in adversarial environments, we organized
our work as follows. First, we describe the main attack types that can be carried out by using
PDF files. Second, we provide a detailed description of state-of-the-art PDF detectors, by primarily
focusing on their machine-learning components. Third, we show how such systems can be evaded
with different adversarial attacks. In particular, we provide a complete taxonomy of the attacks
that can be performed against learning-based detectors, by also describing how such attacks can
be concretely implemented and deployed. Finally, we overview possible solutions that have been
proposed to mitigate such attacks, thus sketching further research directions in developing novel
adversarial attacks and defenses against them.
This work also aims to provide solid bases to overcome the hurdles that can be encountered
when working in adversarial environments. We firmly believe that these principles can constitute
useful guidelines when dealing with the generic task of malware detection, not only restricted
to PDF files. We claim that systems based on machine learning for malware detection should be
built by accounting for the presence of attacks carefully tailored against them, i.e., according to the
principle of security by design, which suggests to proactively model and anticipate the attacker to
build more secure systems [14, 17].
The rest of the paper is structured as follows: Section 2 provides a general overview of the
attacks carried out with the PDF file format, as well as the possible attacks in the wild that can be
carried out against PDF readers; Section 3 depicts the various detection methodologies that have
been introduced over the years, also discussing the detection performances they achieved; Section 4
provides a complete taxonomy of the adversarial attacks that can be carried out by exploiting
learning-based vulnerabilities, as well as an insight into how these attacks can be implemented;
Section 5 describes the current countermeasures adopted against adversarial threats, and sketches
possible research directions for developing new attacks and defenses. Section 6 concludes the paper
by also providing a comparison with previous works.
2 PDF MALWARE
This Section aims to provide the basics to understand how PDF malware in the wild executes its
malicious actions. To this end, we divided this Section into two parts. In the first part (Section 2.1),
we provide a comprehensive overview of the PDF file format. In the second part (Section 2.2), we
describe how PDF malware uses the characteristics of its file format to exploit vulnerabilities.
2.1 Overview of PDF files
Digital documents, albeit parsed differently by their respective readers, share some essential
aspects. In particular, they can typically be represented as a combination of two components: a
general structure that outlines how the document contents are stored, and the file content that
describes the information that is visualized to the user (such as text, images, scripting
code).
General Structure. PDF (Portable Document Format) is one of the most used formats, along with
Microsoft Office, to render digital documents. It can be conceptually considered as a graph of
objects, each of them performing specific actions (e.g., displaying text, rendering images, executing
code, and so forth). The typical structure of a PDF file is shown in Figure 1, and it consists of four
parts [2, 4]:
• Header. A single line of text containing information about the PDF file version, introduced
by the marker %.
• Body. A sequence of objects that define the operations performed by the file. Such objects
can also contain compressed or uncompressed embedded data (e.g., text, images, scripting
code). Each object possesses a unique reference number, typically introduced by the sequence
number 0 obj,¹ where number is the proper object number. PDF objects can be referenced
by others by using the sequence number 0 R,² where number identifies the target object that
is referenced. Each object is ended by the endobj marker. The functionality of each object
is described by keywords (also known as name objects - highlighted in bold in the Figure),
which are typically introduced by /.

¹ The value between number and obj is called generation number, and it is typically 0. It can be different in some corner
cases, but we recommend referring to the official documentation for more information.
² The value between number and R is the same generation number as the original object.
• Cross-Reference (X-Ref) Table. A list of offsets that indicate the position of each object in
the file. Such a list gives PDF readers precise indications on where to begin parsing each
object. The Cross-Reference Table is introduced by the marker xref, followed by a sequence
of numbers, the last of which indicates the total number of objects in the file. Each line of the table
corresponds to a specific object, but only lines that end with n are related to objects that are
concretely stored in the file. It is worth noting that PDF readers parse only the objects
that are referenced by the X-Ref Table. Therefore, it is possible to find objects that are stored
in the file, but that lack a reference in the table.
• Trailer. A special object that describes some basic elements of the file, such as the first object
of the graph (i.e., where the PDF readers should start parsing the file information). Moreover,
it contains references to the file metadata, which are typically stored in one single object.
The keyword trailer always introduces the trailer object.
Fig. 1. PDF file structure, with examples of header, body, cross-reference table and trailer contents. Object
names (i.e., keywords) are highlighted in bold.
PDF files are parsed as follows: first, the trailer object is parsed by PDF readers to retrieve
the first object of the hierarchy. Then, each object of the PDF graph (contained in the body) is
explored and rendered by combining the information contained in the X-Ref Table with the reference
numbers found inside each object. Every PDF file is terminated by a %%EOF (End Of File) marker. An
interesting characteristic of PDF files is that they can also be updated without being regenerated
from scratch (although the latter is possible), with a process called versioning. When an existing
object is modified, or new objects are added to the file, a new body, X-Ref table, and trailer are
appended to the file. The new body and X-Ref table only contain information about the changes
that occurred to the document.
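To make the parsing flow concrete, the following minimal Python sketch (our own illustration, not taken from any cited tool; the function name is hypothetical, and it assumes a classic, uncompressed cross-reference layout) locates the entry points a reader uses: the startxref offsets and the /Root references announced by each trailer, of which an incrementally updated (versioned) file may contain several.

```python
import re

def locate_structure(pdf_bytes):
    """Find the entry points a PDF reader uses to start parsing.

    Readers parse from the end of the file: each 'startxref' value
    points to an X-Ref section, and each trailer names the first
    object of the graph (/Root). Versioned files contain one such
    triple (body, X-Ref table, trailer) per incremental update.
    """
    # Offsets of the X-Ref sections announced by 'startxref'
    xref_offsets = [int(m.group(1))
                    for m in re.finditer(rb"startxref\s+(\d+)", pdf_bytes)]
    # Object number of the document catalog named by each trailer
    roots = re.findall(rb"trailer.*?/Root\s+(\d+)\s+\d+\s+R",
                       pdf_bytes, flags=re.DOTALL)
    return xref_offsets, [int(r) for r in roots]

with open("sample.pdf", "rb") as f:  # any local PDF file
    print(locate_structure(f.read()))
```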
Objects. As previously stated, objects are typically identified by a number, and they are more
formally referred to as indirect objects. However, every element inside the body is generally regarded
as an object, even if a number does not identify it. When an object is not identified by a number
(i.e., when it is part of other objects), it is called direct.
Indirect objects are typically composed of a combination of direct objects. Listing 1 reports a
typical example of a PDF object.
8 0 obj
<</Filter[/FlateDecode]/Length 384/Type/EmbeddedFile>>
stream
...
endstream
endobj
Listing 1. An example of PDF object. The content of the stream has not been reported for the sake of brevity.
Most of the time, indirect objects (in this case, the object is number 8) are dictionaries (enclosed
in << >>) that contain sequences of paired direct objects. Each pair typically provides specific
information about the object. The object in the Listing contains a sequence of three pairs of objects,
which instruct the reader to decompress an embedded file (marked by the keywords /Type/EmbeddedFile) of length
384 (keyword /Length) with a FlateDecode compression filter (keywords /Filter/FlateDecode).
Objects that contain streams always feature the markers stream and endstream at the very end,
meaning that PDF objects first instruct the PDF readers about their functionality, and then provide
the data on which they operate.
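As a concrete illustration, the following Python sketch (ours, not part of any cited tool) extracts and decompresses FlateDecode streams like the one in Listing 1. It assumes plain, uncompressed object syntax with a single /FlateDecode filter; real files may chain filters or require /DecodeParms handling.

```python
import re
import zlib

def extract_flate_streams(pdf_bytes):
    """Decompress the FlateDecode streams found in a raw PDF byte string."""
    pattern = re.compile(
        rb"/Filter\s*\[?\s*/FlateDecode\s*\]?[^>]*>>\s*stream\r?\n(.*?)endstream",
        re.DOTALL)
    decoded = []
    for match in pattern.finditer(pdf_bytes):
        d = zlib.decompressobj()  # tolerates trailing whitespace bytes
        try:
            decoded.append(d.decompress(match.group(1)))
        except zlib.error:
            pass  # malformed or truncated data: itself a suspicious sign
    return decoded
```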
Among the various types of objects, some perform actions such as executing
JavaScript code, opening embedded files, or even performing automatic actions when the file
is opened or closed. To this end, the PDF language resorts to specific name objects (keywords)
that are typically associated with actions that can have repercussions from a security perspective.
Among others, we mention /JavaScript and /JS for executing JavaScript code; /Names,
/OpenAction, /AA for performing automatic actions; /EmbeddedFile and /F for opening embedded
files; /AcroForm to execute forms. The presence of such objects should be considered a first hint
that the file may perform malicious actions. However, as such objects are also widely used in
benign files, it may be quite hard to establish the maliciousness of a file by only inspecting them.
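A PDFId-style counter of such name objects is straightforward to sketch in Python (the helper below is hypothetical and ours; it deliberately treats the counts as hints, not verdicts):

```python
import re
from collections import Counter

# Name objects tied to active content (none alone proves maliciousness)
SUSPICIOUS_NAMES = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
                    b"/Names", b"/EmbeddedFile", b"/AcroForm", b"/F"]

def keyword_hints(pdf_bytes):
    """Count suspicious name objects; a name ends at a non-name character."""
    counts = Counter()
    for name in SUSPICIOUS_NAMES:
        pattern = re.escape(name) + rb"(?![0-9A-Za-z])"
        counts[name.decode()] = len(re.findall(pattern, pdf_bytes))
    return counts
```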
2.2 Malicious Exploitation
The majority of attacks carried out using documents resort to scripting code to execute
malicious payloads. Therefore, after having described the general structure of PDF files, we now provide
an insight into the security issues related to the contents that can be embedded in the file. To do
so, we provide a taxonomy of the various attacks that can be perpetrated by using PDF files. In
particular, we focus on attacks targeting Adobe Reader, the most used PDF reader in the wild. The
common idea behind all attacks is that they exploit specific vulnerabilities of Adobe Reader
components, and in particular of its plugins. Vulnerabilities can be characterized by multiple
exploitation strategies, which also depend on the targeted Reader component.
Table 1 reports a list of the major Adobe Reader vulnerabilities that have been exploited in
the wild (either with proofs-of-concept or proper malware) in the last decade, along with a brief
description of their type and exploitation strategies. Notably, we did not include variants of the
same vulnerability in the Table, and we only focused on the most representative ones. This list was
obtained by analyzing exploit databases, media sources, security bulletins, and our own file
database retrieved from the VirusTotal service [33, 35, 41, 73, 79, 91, 98]. According to Table 1,
there are three primary ways to perform exploitation:
• JavaScript-based. These vulnerabilities are exploited by exclusively employing JavaScript
code, and this is the most common way to perform exploitation. The attack code can be scattered
through multiple objects in the file, or it can be contained in one single object (especially in
older exploits).
Table 1. An overview of the main exploited vulnerabilities against Adobe Reader, along with their vulnerability
and exploitation type.
Vulnerability   | Vuln. Type                              | Exploitation Type
CVE-2008-0655   | API Overflow (Collab.collectEmailInfo)  | JavaScript
CVE-2008-2992   | API Overflow (util.printf)              | JavaScript
CVE-2009-0658   | Overflow                                | File Embedding (JBIG2)
CVE-2009-0927   | API Overflow (Collab.getIcon)           | JavaScript
CVE-2009-1492   | API Overflow (getAnnots)                | JavaScript
CVE-2009-1862   | Flash (Memory Corruption)               | ActionScript
CVE-2009-3459   | Malformed Data (FlateDecode Stream)     | JavaScript
CVE-2009-3953   | Malformed Data (U3D)                    | JavaScript
CVE-2009-4324   | Use-After-Free (media.newPlayer)        | JavaScript
CVE-2010-0188   | Overflow                                | File Embedding (TIFF)
CVE-2010-1240   | Launch Action                           | File Embedding (EXE)
CVE-2010-1297   | Flash (Memory Corruption)               | ActionScript
CVE-2010-2883   | Overflow (coolType.dll)                 | JavaScript
CVE-2010-2884   | Flash (Memory Corruption)               | ActionScript
CVE-2010-3654   | Flash (Memory Corruption)               | ActionScript
CVE-2011-0609   | Flash (Bytecode Verification)           | ActionScript
CVE-2011-0611   | Flash (Memory Corruption)               | ActionScript
CVE-2011-2462   | Malformed U3D Data                      | JavaScript
CVE-2011-4369   | Corrupted PRC Component                 | JavaScript
CVE-2012-0754   | Flash (Corrupted MP4 Loading)           | ActionScript
CVE-2013-0640   | API Overflow                            | JavaScript
CVE-2013-2729   | Overflow                                | File Embedding (BMP)
CVE-2014-0496   | Use-After-Free (toolButton)             | JavaScript
CVE-2015-3203   | API Restriction Bypass                  | JavaScript
CVE-2016-4203   | Invalid Font                            | File Embedding (TFF)
CVE-2017-16379  | Type Confusion (IsAVIconBundleRec6)     | JavaScript
CVE-2018-4990   | Use-After-Free (ROP chains)             | JavaScript
• ActionScript-based. These vulnerabilities exploit the capability of Adobe Reader to parse
Flash (ActionScript) files, due to the integration between Reader and Adobe Flash
Player. ActionScript code can be used in combination with JavaScript to attain more effective
exploitation.
• File Embedding. This exploitation technique resorts to external file types, such as .bmp,
.tiff and .exe. Typically, the exploitation is triggered when specific libraries of Adobe
Reader attempt to parse such files. It is also possible to embed other PDF files: however, this
is not considered an exploitation technique, but rather a way to conceal other attacks
(see the next sections for more details).
From this list, it can be observed that, although numerous vulnerabilities are still disclosed in
Adobe Reader, only a few have been recently exploited in the wild. This small number of exploited
vulnerabilities reflects the fact that PDF files are now less preferred as exploitation vectors by
attackers. However, things can unexpectedly change when new, dangerous vulnerabilities are
disclosed (such as CVE-2018-4990). In the following, we provide a more detailed description of the
previously mentioned exploitation techniques, by also providing concrete examples from existing
vulnerabilities.
JavaScript-Based Attacks. JavaScript-based attacks are the most used ones in PDF files due to
the massive support that the file format provides for this scripting language. In particular, Adobe
introduced specific API calls that are only supposed to be used in PDF files (their specifications are
contained in the official Adobe references [3]), and that can be exploited to execute unauthorized
code on a machine.
According to the vulnerability types contained in Table 1, multiple vulnerabilities can be exploited
through JavaScript code. Such vulnerabilities can be organized into multiple categories, which we
describe in the following:
• API-based Overflow. This vulnerability type typically exploits wrong argument parsing in
specific API calls that belong to the PDF parsing library, thus allowing attackers to perform buffer
overflow or ROP-based attacks³ to complete the exploit. Typical examples of vulnerable APIs
are util.printf and Collab.getIcon, which were among the first ones to be exploited
when PDF attacks started to be massively used.
• Use-After-Free. This vulnerability type is based on accessing memory areas that have been
previously freed (and not re-initialized). Normally, such behavior would make the program
crash. However, it could allow arbitrary code execution in a vulnerable program.
• Malformed Data. This vulnerability type is triggered by compressed malformed data that get
decompressed at runtime. Such data is typically stored in streams.
• Type Confusion. This vulnerability type may affect functions that receive void pointers as
parameters. Typically, the pointer type is inferred by checking some bytes of the pointed
object. However, such bytes can be manipulated so that even an invalid pointer type is
recognized as valid, thus allowing arbitrary code execution.
A large number of vulnerabilities and exploitation types allow attackers to carry out malicious
actions that are often not easy to detect. The evolution of the employed exploitation techniques
becomes particularly evident if we observe the differences between the first API overflows (e.g.,
CVE-2008-0655) and the most recent attack strategies. We expect that this trend will further evolve
with more sophisticated techniques.
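Detection-wise, many of these JavaScript-based exploits share recognizable surface traits. The following Python sketch (a heuristic of our own, not taken from any cited detector; the threshold is arbitrary) flags calls to historically vulnerable APIs from Table 1, together with common obfuscation hints such as unescape combined with unusually long string literals, a pattern often used for heap spraying:

```python
import re

# API calls historically tied to vulnerabilities listed in Table 1
VULNERABLE_APIS = ["Collab.collectEmailInfo", "util.printf",
                   "Collab.getIcon", "getAnnots", "media.newPlayer"]

def scan_javascript(js_code):
    """Flag vulnerable API calls and simple obfuscation hints."""
    hits = [api for api in VULNERABLE_APIS if api in js_code]
    long_strings = [s for s in re.findall(r'"([^"]*)"', js_code)
                    if len(s) > 1000]  # threshold chosen arbitrarily
    return {"vulnerable_apis": hits,
            "long_string_literals": len(long_strings),
            "uses_unescape": "unescape(" in js_code}
```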
ActionScript-Based Attacks. As PDF files can visualize Flash content through Adobe Reader's
support for Adobe Flash technologies, one way to exploit Reader is to trigger vulnerabilities of
its Flash component by embedding malicious ShockWave Flash (SWF) files and ActionScript
code. Normally, such code is used in combination with JavaScript: while executing ActionScript
triggers the vulnerability itself, the JavaScript code carries out and finalizes the exploitation.
This exploitation technique was particularly popular in 2010 and 2011.
In a similar way to what we described for JavaScript, multiple vulnerabilities can be exploited
by using ActionScript. In the following, we describe the prominent ones.
• Memory Corruption. This vulnerability occurs when specific pointer values in memory get
corrupted in a way that they point to other memory areas, controlled by the attacker. It is
the most used way to exploit Flash components.
• ByteCode Verification. This vulnerability allows attackers to execute code in uninitialized
areas of memory.
• Corrupted File Loading. This vulnerability is triggered when parsing specific, corrupted video
files.
³ Return-Oriented Programming, an exploitation strategy that leverages return instructions to create shellcode.
Generally speaking, vulnerabilities that affect the Flash components of Adobe Reader are more
complex to exploit than others. This is because two types of scripting code must be executed,
and exploitation is typically carried out in two stages (first ActionScript execution, then
JavaScript code). For this reason, attackers typically prefer exploits that are simpler to carry
out. Moreover, Flash-based technologies will be dismissed in a few years, giving attackers further
reasons not to develop new exploits.
File Embedding. This exploitation technique leverages the presence of embedded, malformed content inside
the PDF file. Typically, decoding such content leads to automatic memory spraying, which can be
further exploited to execute code. The file contents mostly used for such attacks are those
related to images (such as BMP, TIFF) or fonts (such as TFF). In other cases, such as the direct
execution of EXE payloads, the content is not necessarily malformed, but simply directly executed
(this also avoids the need to execute malicious JavaScript code to further the exploitation). The
execution of EXE payloads can also lead to the generation of additional, malicious payloads (such
as VBS), whose final goal is dropping the final piece of malware.
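Since these attacks hinge on what is hidden inside streams, a simple first check is to identify the type of each decoded embedded stream from its leading bytes. The sketch below (ours; it builds on the hypothetical extract_flate_streams helper shown earlier) covers the file types most abused according to Table 1:

```python
# Magic bytes of embedded file types commonly abused in PDF attacks
MAGIC = {b"MZ": "EXE", b"BM": "BMP",
         b"II*\x00": "TIFF (little-endian)",
         b"MM\x00*": "TIFF (big-endian)",
         b"%PDF": "nested PDF"}

def identify_embedded(decoded_stream):
    """Guess the type of a decoded /EmbeddedFile stream from its header."""
    for magic, label in MAGIC.items():
        if decoded_stream.startswith(magic):
            return label
    return "unknown"

print(identify_embedded(b"MZ\x90\x00..."))  # -> 'EXE'
```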
3 MACHINE LEARNING FOR PDF MALWARE DETECTION
Machine learning is nowadays widely used to detect document-based infection vectors. Concerning
PDF files in particular, multiple detectors implementing this technology were developed in the last
decade. Therefore, the aim of this Section is to provide an overview of the characteristics of
such detectors. However, as this survey focuses on the implications of adversarial attacks against
machine learning systems, this Section only concerns systems that employ supervised machine
learning to perform detection, meaning that we will not discuss PDF malware detectors that employ
rule-based or unsupervised approaches (e.g., [55, 59, 78, 84, 93, 94, 100, 101]). For a more detailed
description of such systems, we refer the reader to more general-purpose surveys [32, 72].
Fig. 2. Graphical architecture of a learning-based PDF malware detection tool: a PDF file goes through (1) pre-processing and (2) feature extraction, and (3) a trained classifier labels it as malicious or legitimate.
The primary goal of machine-learning detectors for malicious documents is to discriminate
between benign and malicious files. They can operate by analyzing and classifying information
retrieved either from the structure or the content of the document. More specifically, all systems
aiming to detect malicious PDF files share the same basic structure (reported in Figure 2), which is
composed of three main components [63]:
(1) Pre-Processing. This component parses PDF files by isolating the data that are crucial for
detection. For example, it can extract JavaScript or ActionScript code, select specific
keywords or metadata, and so forth.
(2) Feature Extraction. This component operates on the information extracted during the
pre-processing phase, by converting it into a vector of numbers. Such a vector can represent, for
example, the presence of specific keywords or API calls, or the occurrences of certain
elements in the file.
(3) Classifier. The learning algorithm itself. Its parameters are first tuned during the training
phase to reduce overfitting and guarantee the highest flexibility against polymorphic variants.

Table 2. An overview of the main characteristics of current PDF malware detectors.

Detector                   | Tool       | Year | Pre-processing           | Features   | Classifier
Shafiq et al. [80]         | N-Gram     | 2008 | Static (Custom)          | RAW Bytes  | Markov
Tabish et al. [89]         | N-Gram     | 2009 | Static (Custom)          | RAW Bytes  | Decision Trees
Cova et al. [24]           | Wepawet    | 2010 | Dynamic (JSand)          | JS-based   | Bayesian
Laskov and Šrndić [54]     | PJScan     | 2011 | Static (Poppler)         | JS-based   | SVM
Maiorca et al. [65]        | Slayer     | 2012 | Static (PDFID)           | Structural | Random Forest
Smutz and Stavrou [82]     | PDFRate-v1 | 2012 | Static (Custom)          | Structural | Random Forest
Šrndić and Laskov [85, 95] | Hidost     | 2013 | Static (Poppler)         | Structural | Random Forest
Corona et al. [23]         | Lux0R      | 2014 | Dynamic (PhoneyPDF)      | JS-based   | Random Forest
Maiorca et al. [62]        | Slayer NEO | 2015 | Static (PeePDF+Origami)  | Structural | Adaboost
Smutz and Stavrou [83]     | PDFRate-v2 | 2016 | Static (Custom)          | Structural | Classifier Ensemble
According to the three components described above, Table 2 provides a general overview of
machine learning-based PDF detectors that have been released since 2008.
Note how each system employs a combination of different component types (e.g., a specific
pre-processing module with a specific learning algorithm or feature extractor). Therefore, we
structure the remainder of this Section by describing how each component can be characterized,
along with its strengths and weaknesses (which will be crucial when such components are observed
from the perspective of adversarial machine learning).
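Before detailing each component, the following Python sketch (a toy illustration of ours, not any cited system; the vocabulary and corpus are made up) shows how the three components of Figure 2 fit together, using keyword counts as features and a Random Forest as classifier:

```python
from sklearn.ensemble import RandomForestClassifier

# Toy vocabulary of structural keywords (in the spirit of Table 2)
VOCABULARY = [b"/JavaScript", b"/OpenAction", b"/EmbeddedFile",
              b"/AcroForm", b"/Page", b"/Font"]

def extract_features(pdf_bytes):
    """Components 1-2 of Figure 2: map (pre-processed) raw bytes to a
    fixed-size numerical vector of keyword occurrence counts."""
    return [pdf_bytes.count(k) for k in VOCABULARY]

# Tiny made-up corpus: y = +1 for malicious, -1 for legitimate
corpus = [b"... /Page /Font /Page /Font ...",
          b"... /OpenAction /JavaScript /AcroForm ..."]
labels = [-1, +1]

# Component 3: the classifier, trained on the extracted vectors
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit([extract_features(d) for d in corpus], labels)
print(clf.predict([extract_features(b"... /JavaScript /OpenAction ...")]))
```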
3.1 Pre-processing
As reported in the previous Sections, the pre-processing phase is crucial to select the data that are
used for detection. Table 2 shows that there are two major types of pre-processing: static and
dynamic. Static pre-processing analyzes the PDF file without executing it (or its contents). Dynamic
pre-processing employs techniques such as sandboxing or instrumentation to execute either the
PDF file or its JavaScript content. While static analysis is considerably faster and does not require
many computational resources, dynamic analysis is much more effective at detecting obfuscated
samples, as certain obfuscation routines are almost impossible to de-obfuscate
automatically (in many cases there are multiple stages of obfuscation). As employing a sandboxed
Adobe Reader is computationally quite expensive and not practical for the end user, most detection
systems rely on static parsing.
When describing pre-processing tools, we generally divide them into two categories, which we
list in the following:
• Third-party Processing. This category includes parsers that have already been developed
and tested, and that do not directly depend on detection tools, thus being included as external
modules in the system. The main advantage of employing third-party parsers is that they
include many functionalities that are less prone (although not immune, as we will discuss
in more detail later) to bugs. However, they can also embed unneeded functionalities and can
be computationally quite heavy.
• Custom Processing. This category includes parsers that have been written from scratch, in
order to adapt to the information required by the detection system. As such parsers are
tailored to the operations of the detectors, their functionalities are rather limited and more
prone to bugs, as typically they have not been extensively tested.
Table 3. An overview of third-party parsers employed by machine learning-based PDF malware detectors.
Each parser has been evaluated concerning three key components of the file. When the parser can completely
analyze that specific component, we use the term Complete; when only certain operations (in brackets) can
be performed, we employ the term Partial; finally, when no support is provided, we use the term None.
Parser          | PDF Structure            | JavaScript               | Embedded Files
Origami [53]    | Complete                 | Partial (Code Analysis)  | Complete
JSand [24]      | None                     | Complete                 | Partial (Analysis)
PDFId [86]      | Partial (Key Analysis)   | None                     | None
PeePDF [34]     | Partial (Obj. Analysis)  | Partial (Code Analysis)  | Complete
PhoneyPDF [90]  | None                     | Complete                 | None
Poppler [37]    | Complete                 | Partial (Extraction)     | Complete
Table 2 clearly shows that developers' favorite choice is to rely on already existing tools,
mostly because of their resilience to bugs and capability to adapt to most file variants. For this
reason, in the following, we provide a more extensive description of third-party parsers, which is
summarized in Table 3.
The Table is organized as follows. Each parser analyzes three main elements of the PDF file:
the PDF structure, the embedded JavaScript code, and the presence of embedded files of any type
(including further PDF files). Such elements are analyzed with three degrees of support: Complete,
Partial or None. The impact of each degree depends on the analyzed PDF element. In
the following, we provide a more detailed description of each degree of support for each PDF
element:
• PDF Structure. It refers to all elements of the PDF structure that are not related to embedded
code, such as direct or indirect objects, metadata, and so on. When parsers completely support
the PDF structure, they can not only extract and analyze objects and metadata, but
also perform structural changes to the file, such as object injection or metadata
manipulation. When the support is partial, we typically refer to parsers that are only able to
analyze objects, but not to manipulate them. When the support is set to None, the
PDF structure is not analyzed at all. Poppler [37] and Origami are the only parsers that
provide the possibility of properly injecting and manipulating content inside the file.
• JavaScript. It refers to the embedded JavaScript code inside the file. When the support
for JavaScript is complete, the code can be both statically and dynamically
analyzed (to overcome obfuscation), for example through instrumentation. When the
support is partial, the code can only be statically analyzed (or even only
extracted, as happens in Poppler [37]), leading to some limitations when heavy obfuscation
is employed. Finally, when no support is provided, the JavaScript code is not even extracted.
JSand and PhoneyPDF [24, 90] are the only parsers that completely support JavaScript
instrumentation and execution.
• Embedded Files. It refers to the capability of parsers to extract or inject embedded files
(such as executables, office documents, or even other PDF files). When the support is complete,
parsers can both extract and inject embedded files into the original PDF. When the support
is partial, embedded files can only be extracted (or analyzed). Finally, when no support is
provided, embedded files cannot be extracted. Origami, PeePDF and Poppler [34,
37, 53] support extraction and analysis of embedded contents.
From the description provided in Table 3, it can be inferred that no parser can extract or
manipulate all elements of the PDF file, although some of them allow for more functionalities. For
this reason, the choice of the parser is related to the type of information that is needed by
the learning algorithm to perform its functionality. In the following, we provide a brief description
of each third-party parser.
• Origami [53]. This parser, entirely written in Ruby, allows users to navigate the object
structure of PDF files, to craft malicious files by injecting code or other objects, to decompress
and decrypt streams, and so forth. Moreover, it embeds information to recognize popular
JavaScript API-based vulnerabilities (see Section 2.2).
• JSand [24]. This parser was part of the Wepawet engine and performed dynamic analysis of PDF
files. It could execute the embedded JavaScript code to extract API calls and de-obfuscate
code. Moreover, it could inspect embedded executables to reveal the presence of additional
attacks. Unfortunately, the Wepawet service is currently not available; hence it is not possible
to test JSand anymore.
• PDFId [86]. This parser has been developed to extract the PDF name objects (see Section 2).
It does not perform additional analysis on embedded code or files.
• PeePDF [34]. This parser, entirely written in Python, can perform a complete analysis of
the PDF file structure (without being able to inject objects). It allows the injection and extraction
of embedded files, and it provides a basic static analysis of JavaScript code.
• PhoneyPDF [90]. This parser, entirely written in Python, performs dynamic analysis of
embedded JavaScript code through instrumentation. More specifically, it emulates the
execution of the JavaScript code embedded in a PDF file, in order to extract API calls that
are strictly related to the PDF execution. This parser does not perform any structural analysis
or embedded file extraction.
• Poppler [37]. Poppler is a C++ library that is used by popular, open-source software such
as Xpdf to render the contents of the file. For this reason, the library features complete
support for PDF structure parsing and management, as well as JavaScript code extraction and
injection of embedded files.
Concerning custom parsers, it is important to observe that we could not access the tools that
adopted them, as their source code was not publicly released. Hence, we could only refer to what
has been stated in the released papers [80, 82, 83, 89]. While raw-byte parsers [80, 89] only focused
on extracting byte sequences from the PDF, the parser adopted by PDFRate [82, 83] analyzed and
extracted the object structure of the PDF file, with a particular focus on PDF metadata. However,
the latter parser has been used as a test bench for adversarial attacks, and researchers proved it
could be easily evaded [20, 96] (see the next Sections).
3.2 Feature Extraction
Feature extraction is essential to PDF malware classification. In this phase, data obtained from the
pre-processing phase are further parsed and transformed into vectors of numbers of a fixed size.
Table 4 provides an overview of the feature types that are used by each PDF detector. We can divide
the employed feature types into three categories:
• Structural. These features are related to the PDF structure, and most concern the presence
or the occurrence of specific keywords (name objects) in the file. Others include metadata or
the presence/count of indirect objects or streams.
• JS-Based. These features are related to the structure of JavaScript code. Most of them
concern lexical characteristics of the code (for example, the number of specific operators in
the file), used API calls, or information obtained from the code behavior (e.g., when shellcodes
are embedded in the attack).
• Raw Bytes. This category includes features that concern sequences of bytes taken as n-grams
(i.e., groups of n bytes, where n is typically a small integer).

Table 4. An overview of the feature types selected by each machine learning-based PDF detector. Each field
of the table further specifies the type of feature employed by each detector. When a specific feature type is
not used by the detector, we put an x in that field.

Detector                   | Tool       | Year | Structural    | JS-Based        | RAW Bytes
Shafiq et al. [80]         | N-Gram     | 2008 | x             | x               | N-Grams
Tabish et al. [89]         | N-Gram     | 2009 | x             | x               | N-Grams
Cova et al. [24]           | Wepawet    | 2010 | x             | Execution-Based | x
Laskov and Šrndić [54]     | PJScan     | 2011 | x             | Lexical         | x
Maiorca et al. [65]        | Slayer     | 2012 | Keywords      | x               | x
Smutz and Stavrou [82]     | PDFRate-v1 | 2012 | Metadata      | x               | x
Šrndić and Laskov [85, 95] | Hidost     | 2013 | Key. Sequence | x               | x
Corona et al. [23]         | Lux0R      | 2014 | x             | API-Based       | x
Maiorca et al. [61, 62]    | Slayer NEO | 2015 | Keywords      | API-Based       | x
Smutz and Stavrou [83]     | PDFRate-v2 | 2016 | Metadata      | x               | x
Most PDF detectors implement feature extraction methodologies that include only one type
of feature. Byte-level features were among the first ones employed to solve the problem of PDF
malware detection, and the very first works in the field mostly adopted them [80, 89]. Typically,
these features are represented by simple sequences of bytes taken in groups of n, where n is a
very small integer. The reason for such a small number is that the feature space explodes
quite easily: using 1-grams means a total of 256 features, while 2-grams already mean 65,536 features. For
this reason, this solution has not been considered very practical on standard machine learning
models. Moreover, byte-level features do not typically convey explainable information on the file
characteristics.
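A minimal sketch of such a byte n-gram extractor (our illustration; the dense representation is deliberately naive to make the 256^n blow-up visible):

```python
from collections import Counter

def byte_ngrams(data, n=1):
    """Byte-level n-gram counts as a dense vector of size 256**n:
    n=1 yields 256 features, n=2 already yields 65,536."""
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    vector = [0] * (256 ** n)
    for gram, c in counts.items():
        vector[int.from_bytes(gram, "big")] = c
    return vector

assert len(byte_ngrams(b"%PDF-1.7", n=1)) == 256
assert len(byte_ngrams(b"%PDF-1.7", n=2)) == 65536
```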
JavaScript-based features have been mostly adopted by Wepawet, PJScan and Lux0R [23, 24,
54]. The general idea behind using these features is to isolate functions that perform dangerous
operations, as well as to detect the presence of obfuscated code that is typically associated
with malicious behaviors. Wepawet [24] extracted information obtained from the code execution in
an emulated environment. Such information is mostly related to code behavior, such as the type
of parameters that are passed to specific methods, the sequences of suspicious API calls during
execution, the number of bytes that are allocated for string operations (which may be a hint of heap
spraying), and so forth. PJScan resorted to lexical information extracted from the JavaScript code
itself, such as the count of specific operators (such as + or ( )) that are known to be abused
in obfuscated code. Moreover, it performed additional checks on the length of strings to point out
the presence of suspicious exploitation techniques (such as buffer overflow or heap spraying). Finally,
Lux0R exclusively operates on JavaScript API calls that belong to the Adobe Reader specifications.
In particular, each call is evaluated only once (after being extracted during the pre-processing
phase), leading to a binary feature vector of calls. While such features are excellent at analyzing and
detecting attacks that carry JavaScript code, they cannot represent other types of attacks, such as
those involving the use of embedded files.
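For instance, a Lux0R-style binary representation can be sketched as follows (the API list below is a made-up excerpt; the real system selects relevant APIs from the Adobe Reader specifications [23]):

```python
# Illustrative excerpt of Adobe Reader JavaScript APIs
API_SET = ["app.alert", "util.printf", "this.getAnnots",
           "Collab.getIcon", "media.newPlayer"]

def api_feature_vector(extracted_calls):
    """Each API is evaluated only once: a binary presence vector,
    regardless of how many times a call appears."""
    present = set(extracted_calls)
    return [1 if api in present else 0 for api in API_SET]

print(api_feature_vector(["util.printf", "util.printf", "app.alert"]))
# -> [1, 1, 0, 0, 0]
```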
Structural approaches have been used considerably in recent years. The main idea, in this case,
was to address all possible attacks reported in Section 2.2 by using the most general approach
possible. This idea also revolves around the concept that malware samples are structurally different
from benign ones. For example, they typically feature fewer pages than benign samples, and the
representation of the content is significantly scarcer. The first approaches (adopted by Slayer and
by its extension Slayer NEO - which also employed a very reduced number of JavaScript-based
features related to known vulnerable APIs) [61, 62, 65] focused on counting the occurrences of
specific keywords in the file. Keywords were regarded as relevant when they appeared at least
once in enough files of the training set. Hidost [85, 95] evolved these approaches by employing
sequences of keywords. More specifically, each feature was extracted by walking the PDF tree
and evaluating how keywords were sequentially referenced. The number of features was limited to
1000, as using all the possible features would have led to an explosion of the algorithm's
complexity. Finally, PDFRate [82, 83] focused on more general-purpose features that included,
among others, the number of indirect objects, the properties of the stream contents (e.g., the
number of upper-case and lower-case characters), and so forth. Moreover, the approach also makes
use of information obtained from metadata to identify suspicious behaviors (for example, when
the PDF file was not generated by popular tools). Although structural approaches proved to be effective
at detecting even ActionScript-based attacks, they exhibit some limitations that will be better
described in the next Sections.
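To illustrate the keyword-sequence idea, the sketch below (ours; a nested dict stands in for the parsed object graph) walks the document tree and emits Hidost-style structural paths that can then be mapped to features:

```python
def keyword_paths(obj, prefix=()):
    """Walk a (toy) PDF object graph and emit keyword sequences,
    in the spirit of Hidost's structural paths [85, 95]."""
    paths = []
    for key, value in obj.items():
        path = prefix + (key,)
        paths.append("/".join(path))
        if isinstance(value, dict):  # nested object
            paths.extend(keyword_paths(value, path))
    return paths

doc = {"Root": {"OpenAction": {"JS": "..."}, "Pages": {"Count": 1}}}
print(keyword_paths(doc))
# ['Root', 'Root/OpenAction', 'Root/OpenAction/JS',
#  'Root/Pages', 'Root/Pages/Count']
```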
3.3 Learning and Classification
Feature extraction can be regarded as the process of mapping an input PDF file z ∈ Z, where Z is the
abstract space containing all PDF files, onto a vector space X ⊆ R^d. In the feature space X, each file
is represented as a numerical vector x of d elements. From a mathematical perspective, we can thus
characterize the feature extraction process as a mapping function ϕ : Z → X that maps PDF files
onto a vector space. Thanks to this abstraction, it is possible to use any learning algorithm to perform
classification of PDF documents. All PDF malware detectors resort to supervised approaches, i.e.,
they require the labels of the training samples to be known. More specifically, a learning algorithm
is trained to recognize a set of known training samples D = {(x_i, y_i)}_{i=1}^n, labeled as either legitimate
(y = −1) or malicious (y = +1).⁴ During this process, the parameters of the learning algorithm
(if any) are typically set according to some given performance requirements. After training, the
learning algorithm provides a classification function f : X → R that can be used to assign a
real-valued score to each input sample x at test time. Without loss of generality, we can assume
here that x is classified as malicious (positive) if f(x) ≥ 0, and legitimate (negative) otherwise.

⁴ Without loss of generality, we assume here that the malicious (legitimate) class is assigned the positive (negative) label.
Notably, the appropriate learning algorithm is selected depending on the features that are used for
classification and on the training data at hand. Decision trees have been used by most PDF malware
detectors, and proved to be very effective at addressing this problem [23, 62, 65, 82, 83, 89, 95]. In
particular, ensemble models such as Random Forests or Boosting showed very high accuracy under
limited false positives. The underlying reason is that such classifiers adapt well to heterogeneous
information, and in particular to discrete feature values such as counts. However, depending on the
feature types, other solutions may be adopted. For example, PJScan [54] resorts to Support Vector
Machines (SVMs), Wepawet [24] to Bayesian classifiers, and Shafiq et al. [80] to Markov models
(to deal with n-gram-based feature representations). Nevertheless, tree-based classifiers generally
reported better performances at detecting PDF malware (see Section 3.4).
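The decision rule above maps directly to code; a minimal sketch with a linear SVM (toy data, with scikit-learn's decision_function playing the role of f):

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy feature vectors x in R^3 with labels y in {-1, +1}
X = np.array([[0, 1, 0], [3, 0, 2], [0, 0, 1], [4, 1, 3]], dtype=float)
y = np.array([-1, +1, -1, +1])

clf = LinearSVC().fit(X, y)

def f(x):
    """The learned classification function f: X -> R."""
    return clf.decision_function(x.reshape(1, -1))[0]

x_test = np.array([2.0, 0.0, 2.0])
# x is classified as malicious (positive) when f(x) >= 0
print("malicious" if f(x_test) >= 0 else "legitimate", f(x_test))
```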
3.4 Detection Results
In the following, Table 5 presents the results attained at PDF malware detection by the most
popular machine learning-based systems in the wild. Notably, the goal of the Table is not to show
which system performs best, but to provide indications on how such detectors generally cope
with PDF attacks in the wild. A direct comparison among the performances attained by all systems
would not be fair, considering that each system was trained and tested on different samples, with
different training/test splits and classifiers. Therefore, in order to build a Table that is coherent and
meaningful, we followed these guidelines:
• We considered the results attained by systems that exclusively detected PDF malware. Therefore, we ruled out [24, 80, 89], as their datasets also included other malware types besides
PDF.
• Our Table includes the overall number of malicious and benign samples, the percentage of the
dataset used for training the system, the True Positive (TP) rate attained in correspondence
of the relative False Positive (FP) rate (TP@FP), and the relative F1 score. TP@FP is among
the most used performance measures in malware analysis, and it is very useful to indicate the
performance of the systems at specific FP values. Keeping a low FP rate guarantees proper
system usability (too many false alarms would disrupt the overall user experience). Note that
we referred to values that were explicitly stated in each original work (we did not consider
any cross-analysis).
• Notably, many papers adopted significantly imbalanced datasets in their analysis. For this
reason, we used F1 as an overall measure that considers the distribution of the data as a
crucial element to measure performance. We point out that systems with a higher F1 score
are not necessarily better than the others. We included this evaluation to provide the reader
with another perspective from which to evaluate the results.
Notably, these choices have been driven by the heterogeneous nature of the examined works. In
particular, many of them did not report the precise numbers of benign and malicious test
samples, but only the overall test-set percentage with respect to a precise number of malicious and
benign samples. For this reason, when calculating the F1 scores for each system, we assumed that
the test percentages were equally applied to benign and malicious samples.
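Under that assumption, the F1 scores can be reconstructed from the published rates as in the Python sketch below (ours; the rates in the example are purely illustrative, not taken from Table 5):

```python
def f1_from_rates(tp_rate, fp_rate, n_mal, n_ben, test_frac):
    """Reconstruct F1 assuming the test split applies equally to both
    classes. tp_rate and fp_rate are fractions in [0, 1]."""
    tp = tp_rate * n_mal * test_frac      # detected malware
    fp = fp_rate * n_ben * test_frac      # misclassified benign files
    fn = n_mal * test_frac - tp           # missed malware
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative numbers only: TP rate 99.5% at 0.1% FP, on a dataset of
# 10,000 malicious and 10,000 benign files with a 50% test split
print(round(f1_from_rates(0.995, 0.001, 10_000, 10_000, 0.5), 3))
```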
Some of the examined works reported multiple results attained by changing the features, classification parameters, and data distribution [54, 82, 85, 95]. Hence, for each system we focused on the
following information (the reader may check the original works for more details):
• PJScan [54]. We considered as malicious the files that were regarded as detected in the
original paper, and as benign the files that were regarded as undetected. We reported the
performances obtained with native tokens and on JavaScript files only, as PJScan does not
make any analysis of non-JavaScript files.
• PDFRate-v1 [82]. We reported the performances related to the lowest number of false
positives stated in the original work.
• Hidost [85, 95]. We reported the result attained by the work in [95], as it clearly states the
attained TP and FP percentages.
• PDFRate-v2 [83]. As performances were stated in the paper by considering multiple classification thresholds, we report the results attained at a 0.5 threshold. This choice was made to
obtain false positive values similar to those attained by the other systems.
The attained results show some interesting trends. First and foremost, almost all systems attained
very high accuracy at detecting PDF malware with low false positives. Such positive results mean
that various information (feature) types can be equally effective at detecting PDF malware.
Second, there is a consistent F1-score difference between Slayer [65] and its evolved Slayer
NEO [62] version, and between PDFRate-v1 [82] and PDFRate-v2 [83], where the attained F1 scores
decreased in the most recent versions of the tools. In particular, we observe that this decrease is due
to a higher number of false positives. Notably, this effect is particularly evident on PDFRate-v2,
where imbalanced datasets significantly penalize the F1 score when false positives increase. Finally,
Table 5. Results attained at PDF malware detection by the most popular machine learning-based detectors.
The Table reports the overall malicious and benign samples, the training set percentages, the True Positives
(TP) at False Positives (FP) rate, and the overall F1 score.

Detector                   | Tool       | Year | Mal. Samples | Ben. Samples | Train. Perc. (%) | TP@FP (%)        | F1
Laskov and Šrndić [54]     | PJScan     | 2011 | 15,279       | 960          | 50               | [email protected]  | 0.832
Maiorca et al. [65]        | Slayer     | 2012 | 11,157       | 9,989        | 57               | [email protected]  | 0.989
Smutz and Stavrou [82]     | PDFRate-v1 | 2012 | 5,297        | 104,793      | 9.1              | [email protected]  | 0.801
Šrndić and Laskov [85, 95] | Hidost     | 2013 | 82,142       | 576,621      | 33.7             | [email protected]  | 0.825
Corona et al. [23]         | Lux0R      | 2014 | 12,548       | 5,234        | 70               | [email protected]  | 0.986
Maiorca et al. [61, 62]    | Slayer NEO | 2015 | 11,138       | 9,890        | 57               | [email protected]  | 0.969
Smutz and Stavrou [83]     | PDFRate-v2 | 2016 | 5,297        | 104,793      | 9.1              | [email protected]  | 0.667
we observe that other works used most of the dataset for training [23, 61, 62, 65], which means
that such systems would require more training data to perform classification correctly.
4 ADVERSARIAL ATTACKS AGAINST PDF MALWARE DETECTORS
In this Section, we start by formalizing a threat model for PDF malware detection, inspired by
previous work in the area of adversarial machine learning, and we will then use it to categorize
existing and potentially new adversarial attacks against such systems.
As highlighted in a recent survey [17], the first attempts to formalize adversarial attacks against
learning algorithms date back to the decade 2004-2014 [7, 8, 10, 13, 14, 16, 26, 39, 45, 49, 58], prior
to the recent discovery of adversarial examples against deep neural networks [40, 88]. PDF malware
has provided one of the major case studies in the literature of adversarial machine learning over
these years, as its inherent structure allows for fine-grained modifications that adapt well to how
adversarial attacks are typically performed. As we will discuss in the remainder of this paper, this is
true in particular for evasion attacks, i.e., attacks in which PDF malware is manipulated at test time
to evade detection. Preliminary attempts at crafting evasive attacks against PDF malware detectors
were first described by Smutz and Stavrou [82], and subsequently by Šrndić and Laskov [95],
even though they were based on heuristic strategies that were able to successfully mislead linear
classification algorithms. To our knowledge, the very first work proposing optimization-based
evasion attacks against PDF malware detectors is that by Biggio et al. [9]. In that work, the
authors showed that even nonlinear models were vulnerable to adversarial PDF malware,
contrary to what was envisioned in [95]. The attack worked by selecting the feasible manipulations
to be performed on each malware sample via a gradient-based optimization procedure, in which
the attacker aims to minimize the classifier's prediction confidence on the malicious class (i.e.,
to get maximum-confidence predictions on the opposite class, namely, the benign one). It is worth
remarking that the gradient-based procedure described in that paper was the first to demonstrate
the vulnerability of learning algorithms to optimization-based evasion attacks, even before the
discovery of adversarial examples against deep networks [40, 88].
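A minimal numpy sketch of this idea follows (ours and heavily simplified: a fixed-gradient linear classifier stands in for the target, and the feasible-manipulation constraint is reduced to "some features may only grow", mimicking content injection into a PDF):

```python
import numpy as np

def evade(x, grad_f, feasible_mask, step=0.1, iters=100):
    """Gradient-based evasion in the spirit of Biggio et al. [9]:
    descend f to minimize the confidence on the malicious class,
    applying only feasible manipulations."""
    x_adv = x.astype(float).copy()
    for _ in range(iters):
        g = grad_f(x_adv) * feasible_mask  # zero out locked features
        x_adv -= step * g                  # gradient descent on f
        x_adv = np.clip(x_adv, x, None)    # injection only: never remove
    return x_adv

# Toy linear target f(x) = w.x + b, with feature 0 not modifiable
w, b = np.array([2.0, -1.0, 0.5]), -1.0
x0 = np.array([3.0, 0.0, 2.0])             # f(x0) = 6.0 > 0: detected
mask = np.array([0.0, 1.0, 1.0])
x_adv = evade(x0, lambda x: w, mask)
print(x_adv, float(w @ x_adv + b))         # f(x_adv) < 0: evaded
```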
4.1 Threat Modeling and Categorization of Adversarial Attacks
The threat model proposed in this Section aims to provide a unified treatment of both attacks
developed in the area of adversarial machine learning and attacks developed specifically against
PDF malware detectors. As we will see, this not only enables us to categorize existing attacks
under a consistent framework, clarifying the (sometimes implicit) assumptions behind each of them;
the proposed threat model will also help us to identify new potential attacks that may threaten
PDF malware detectors in the near future, as well as novel research directions for improving the
security of such systems, inspired by recent findings in the area of adversarial machine learning.
Leveraging the taxonomy of adversarial attacks initially provided by Barreno et al. [7, 8], and
subsequently expanded by Huang et al. [45], Biggio and Roli [17] have recently provided a unified
threat model that categorizes adversarial attacks based on the attacker's goal, her knowledge
of the target system, and her capability of manipulating the input data.⁵ In the following, we discuss
how to adapt this threat model to the specific case of PDF malware detectors.
4.1.1 Attacker's Goal. The attacker's goal is defined based on the desired security violation.
When speaking about system security, an attacker can cause integrity, availability or even
privacy violations. Violating system integrity means having malware samples undetected, without
compromising normal system operation for legitimate users. An example of integrity violation is
when benign PDF files are injected with a payload that is undetected by anti-malware engines. In
this way, the user still visualizes the file content, but other operations occur in the background.
Availability is compromised when many benign (but also malware) samples are misclassified,
effectively causing a denial of service for legitimate users. This situation may occur with PDF files
when the attacker injects multiple (benign) scripting instructions to trigger multiple false alerts
due to the presence of code (which alone cannot be an indicator of maliciousness). Finally, privacy
is violated if the system leaks confidential information about its users, the classification model
used, or even the training data used to learn it. For example, an attacker can incrementally change
some characteristics of a PDF file (for example, by adding text, using fonts, adding scripting
code, and so forth) and send the variants to the target classifier to see how it reacts to such changes.
4.1.2 Attacker’s Knowledge. The attacker can have different levels of knowledge of the targeted
system, including: (i) the training data D, consisting of n training samples and their labels, i.e.,
D = {(x_i, y_i)}_{i=1}^n; (ii) the feature set X ⊆ R^d, which is strongly related to the PDF detector components
depicted in Figure 2 and Table 2 (see Section 3); more specifically, the attacker may know which
pre-processing and feature extraction algorithms are employed, along with the extracted feature
types; and (iii) the classification function f : X → R (to be compared against zero for classification),
along with the objective function minimized during training (if any), its hyperparameters and,
potentially, even the classifier parameters learned after training. The attacker’s knowledge can be
conceptually represented in an abstract space Θ, whose elements correspond to the aforementioned
components (i)-(iii) as θ = (D, X, f). Depending on the assumptions made on (i)-(iii), there may
be different attack scenarios, described in the following, and compactly summarized in Table 6.
Perfect-Knowledge (PK) White-Box Attacks. In this scenario, we assume that the attacker
knows everything about the target system, i.e., θ_PK = (D, X, f). Even though this may rarely
occur in practical settings, this scenario is useful to provide an upper bound on the performance
degradation incurred by the system under attack, and to understand how far from the worst case
more realistic evaluations are.
Limited-Knowledge (LK) Gray-Box Attacks. This category of attacks, in general, assumes
that the attacker knows the feature representation X, but does not have full knowledge of the
training data D and the classification function f. In particular, it is often assumed that the attacker
can collect a surrogate data set D̂ resembling that used to train the target classifier from a similar
source (e.g., a public repository).6 Regarding the classification function f , the attacker may know
the type of learning algorithm used (e.g., the fact that the classifier is a linear SVM) and, ideally, its
hyperparameters (e.g., the value of the regularization parameter C used to learn it), although this
5 We refer to the attacker as feminine here due to the popular role impersonated by Eve (or Carol) in cryptography.
6 We use here the hat symbol to denote limited knowledge of a given component.
Table 6. An overview of the knowledge levels held by an attacker, according to the elements of the knowledge
space. When a component of the space is fully known, we represent it with a checkmark (✓), while we use an
x when the element is partially known or unknown.

Knowledge Level    D   X   f
White box (PK)     ✓   ✓   ✓
Gray box (LK)      x   ✓   x
Black box (ZK)     x   x   x
may not be strictly required. However, the attacker is assumed not to know the exact classifier’s
trained parameters (e.g., the weights of the linear SVM after training), but she can potentially get
feedback from the classifier about its decisions and labels. Under these assumptions, the attacker
can estimate the classification function from D̂ by training a surrogate classifier f̂. We thus denote
this attack scenario with θ_LK = (D̂, X, f̂). Notably, these attacks may also include the case in which
the attacker knows the trained classifier f (white-box attacks), but optimizing the attack samples
against the target function f may not be effective. The reason is that, for certain configurations
of the classifier’s hyperparameters, the objective function in the white-box case may become too
noisy and difficult to optimize with gradient-based procedures, due to the presence of many poor
local minima or null gradients (a phenomenon known as gradient obfuscation [5, 74]). Therefore, it
is preferable for the attacker to optimize the attack samples against a surrogate classifier with a
smoother objective function and then test them against the target one. This is a common procedure
also used to evaluate the transferability of attacks [9, 30, 31, 56, 74].
Zero-Knowledge (ZK) Black-Box Attacks. The term zero knowledge is typically used in the literature
to indicate the possibility that the attacker can query the system to obtain feedback on the labels
and the provided scores, and optimize the attack samples in a black-box fashion [21, 27, 74, 92, 102].
However, it should be clarified that the attacker still has some minimal knowledge of the system. For
example, she knows that the classifier has been designed to perform a specific task (e.g., detecting
malicious PDF files), and she has an idea of what kind of transformations should be made on the samples to
attempt evasion. Hence, some details of the feature representation are still known (e.g., the employed
feature types, whether static or dynamic analysis is performed, etc.). The same considerations can
be done about knowledge of the training data; e.g., it is obvious that a system designed to recognize
specific PDF malware has been trained with benign and malicious PDF files. Thus, in agreement
with Biggio and Roli [17], we characterize this setting as θ_ZK = (D̂, X̂, f̂). Note that using surrogate
classifiers is not strictly necessary here [21, 27, 92, 102]; however, it is possible to learn a surrogate
classifier (potentially on a different feature representation) and check whether the crafted attack
samples transfer to the target classifier. Feedback from the classifier’s decisions on specifically-crafted
query samples can also be used to refine and update the surrogate model [74].
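To make the notation concrete, the following minimal Python sketch (our own illustration, not code from [17]) encodes the three knowledge levels as instances of the tuple θ = (D, X, f), with hatted components denoting surrogates:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Knowledge:
        data: str      # "D" if the training set is known, "D_hat" for a surrogate
        features: str  # "X" if the feature space is known, "X_hat" otherwise
        model: str     # "f" if the trained classifier is known, "f_hat" otherwise

    THETA_PK = Knowledge("D", "X", "f")              # white-box
    THETA_LK = Knowledge("D_hat", "X", "f_hat")      # gray-box
    THETA_ZK = Knowledge("D_hat", "X_hat", "f_hat")  # black-box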
4.1.3 Attacker’s Capability. The attacker’s capability of manipulating the input data is defined in
terms of the so-called attack influence, as well as by some data manipulation constraints. The attack
influence defines whether the attacker can only manipulate test data (exploratory influence), or also
the training data (causative influence). The latter is possible, e.g., if the system is retrained online
using data collected during operation, which can be manipulated by the attacker [8, 14, 17, 45].
Depending on whether the attack is staged at training or test time, different data manipulation
constraints can be defined. These define the changes that can be concretely performed by the
attacker to evade the system. For example, to evade malware detection at test time, malicious code
must be modified without compromising its intrusive functionality. This strategy may be employed
against systems based on static code analysis, by injecting instructions or code that will never be
executed [9, 29, 42, 97]. For training-time attacks, instead, the constraints typically impose that
the attacker can only control a small fraction of the training set [8, 17, 45, 46, 49]. In both cases, as
discussed in [17], the attacker’s capability can be formalized in terms of mathematical constraints
along with the optimization problem defined to craft the attack samples.
4.1.4 Summary of Attacks. We describe here the potential attacks that can be perpetrated against
machine-learning algorithms, according to the assumptions made on the attacker’s goal and on
her capability of manipulating the input data [17]. Table 7 introduces the set of adversarial attacks
that have been considered to date. Notably, each of these attacks can be performed according to
different levels of knowledge, as described in Section 4.1.2.
Table 7. An overview of adversarial attacks against learning systems, adapted from [17].

Attacker’s Capability | Integrity                        | Availability             | Privacy
Test data             | Evasion                          | -                        | Model Extraction/Stealing, Model Inversion, Membership Inference
Training data         | Poisoning (Integrity), Backdoor  | Poisoning (Availability) | -
Evasion Attacks. In this setting, the attacker attempts to manipulate test samples to have them
misclassified as desired [9]. Evasion attacks are also commonly referred to as adversarial examples,
especially when staged against deep-learning algorithms for image classification [17, 88]. These
attacks exploit specific weaknesses of a previously-trained model, and they have been widely used
to show test-time vulnerabilities of learning-based malware detectors.
Poisoning Attacks. Poisoning attacks target the training stage of the classifier. The attacker
intentionally injects wrongly-labeled samples into the training set, aiming to decrease the detection
capabilities of the classifier. If the attack aims to indiscriminately increase the classification error
at test time, causing a denial of service to legitimate system users, it is referred to as a poisoning
availability attack [7, 8, 11, 15, 17, 45, 46, 66, 69]. Conversely, if the attack is targeted to have only
a few samples misclassified at test time, it is named a poisoning integrity attack [7, 8, 11,
15, 17, 45, 46, 49, 50, 69]. We also include backdoor attacks in this category, as they also aim to
cause specific misclassifications at test time by compromising the training process of the learning
algorithm [22, 43, 47, 57]. Their underlying idea is to compromise a model during the training phase
or, more generally, at design time (which may also include, e.g., modifications to the architecture of
a deep network by adding specific layers or neurons), with the goal of enforcing the model
to classify backdoored samples at test time as desired by the attacker. Backdoored samples may
include malware files with a specific signature or images with a specific subset of pixels set to
given values. Once the model recognizes this specific signature (i.e., the backdoor), it should output
the desired classification. A popular example is a stop sign with a yellow sticker attached to
it (i.e., the signature used to activate the backdoor), which is misclassified as a speed-limit sign
by the backdoored deep network considered in [43]. The overall intuition is that these vulnerable
models can then be released to the public, to be used as open-source pre-trained models in other
open-source tools or even commercial products, and consequently make the latter also vulnerable
to the same backdoor attacks.
Privacy Attacks. Privacy-related attacks aim to steal information about unknown models for
which the attacker is given black-box query access. In these attacks, the attacker sends specific
test samples against the target model, with the aim of obtaining information about: (i) the model
itself, via model extraction (or model stealing) attacks [92]; or (ii) the data used to train it, via model
inversion attacks (to reconstruct the training samples) [36] or membership inference attacks (to
evaluate whether a given sample has been used to train the target model or not) [70, 81].
In the remainder of this manuscript, we provide detailed insights into each attack concerning
the problem of PDF malware detection.
4.2 Evasion Attacks against PDF Malware Detectors
Research work on attacking PDF malware detectors has focused mostly on evasion attacks [9, 10, 82, 95].
As highlighted by our previous categorization of adversarial attacks, evasion attacks aim to violate
system integrity by manipulating the input PDF file structure at test time.
In the specific case of attacks against PDF malware detectors, we can identify two main families of
evasion attacks, i.e., optimization-based [9, 10, 96, 102] and heuristic-based [20, 23, 62–64, 82, 83, 95]
attacks. Optimization-based evasion attacks perform fine-grained modifications to the input sample
in order to minimize (maximize) the probability that the sample is classified as malicious (benign).
Heuristic-based evasion attacks also aim to create evasive samples, but they are not formulated in
terms of an optimization problem. They rather apply reasonable modifications that are expected
to cause misclassifications of malware samples at test time, including, e.g., trying to mimic the
structure of benign files. Within these two broad categories of attacks, we will then specify whether
each attack is carried out under a white-box, gray-box or black-box attack scenario, to highlight
the set of assumptions made on the attacker’s knowledge, as encompassed by our threat model. We
will also discuss the practical difficulties associated with concretely staging the attack against a real
system, which are inherent to the creation of the evasive (or adversarial) PDF malware samples.
This issue has been widely known in the literature of adversarial machine learning, especially for
optimization-based attacks, under the name of inverse feature-mapping problem [9, 13, 14, 45]. This
problem arises from the fact that most of the optimization-based attacks proposed thus far craft
evasive instances only in the feature space, i.e., by directly manipulating the values of the feature
vector x, without creating the corresponding evasive PDF malware sample. The modifications
are typically constrained to make it possible to create the actual PDF file, at least in principle;
however, how to invert the representation x to obtain the corresponding PDF file as z = ϕ^{-1}(x) has
only rarely been considered. This point is of particular interest, and goes beyond the problem
of adversarial PDF malware; for example, adversarial attacks manipulating Android and binary
malware are subject to the same issues [29, 52]. The problem here amounts to creating a malware
sample exhibiting the desired feature vector, modified to evade the target system, which is not
always an easy task. For example, in the case of PDF malware, injecting material may change the
file semantics (i.e., the file may behave differently than the original one) or even break the malicious
functionality of the embedded exploitation code.
We summarize the attacks proposed thus far to craft adversarial PDF malware in Table 8. Our
taxonomy is organized along four axes: (i) the family of evasion attacks (i.e., either heuristic-based
or optimization-based); (ii) the attacker’s knowledge of the learning model (i.e., whether she has
white-box, gray-box or black-box access to the target classifier); (iii) the target system(s); and (iv)
whether the attack has been staged at the input level (i.e., by creating the adversarial PDF malware
sample) or at the feature level (i.e., by only modifying the feature values of the attack samples,
without creating the corresponding PDF files).
We provide below a more detailed description of the attacks that have been performed against
PDF malware detectors, following the proposed categorization (Sections 4.2.1-4.2.2). We then discuss
some insights on how to tackle the inverse feature-mapping problem and create adversarial PDF
Table 8. An overview of the evasion attacks proposed thus far against PDF malware detectors. This taxonomy
considers the attack family (i.e., optimization- or heuristic-based), the attacker’s knowledge of the target
system (i.e., whether she has white-box, gray-box, or black-box access), the target system(s), and whether the
attack has been staged at the input level (i.e., creating the adversarial PDF malware samples) or at the feature
level (i.e., only manipulating their feature values, without actually creating the PDF files).
Work                      | Year | Heur./Opt.  | Knowl. (B/G/W) | Target System(s)                       | Real Sample
Smutz and Stavrou [82]    | 2012 | Heur.       | G              | PDFRate-v1                             | No
Šrndić and Laskov [95]    | 2013 | Heur.       | B, W           | Hidost                                 | No
Biggio et al. [9, 10]     | 2013 | Opt.        | G, W           | Slayer                                 | No
Maiorca et al. [64]       | 2013 | Heur.       | B              | Wepawet, PDFRate-v1, PJScan            | Yes
Corona et al. [23]        | 2014 | Heur.       | B              | Lux0R                                  | No
Šrndić and Laskov [96]    | 2014 | Opt.        | W              | PDFRate                                | Yes
Maiorca et al. [61, 62]   | 2015 | Heur.       | B              | Wepawet, PDFRate-v1, PJScan, Slayer N. | Yes
Carmony et al. [20]       | 2016 | Heur.       | B              | PDFRate-v1, PJScan                     | Yes
Xu et al. [102]           | 2016 | Opt.        | B              | PDFRate-v1, Hidost                     | Yes
Smutz and Stavrou [83]    | 2016 | Heur., Opt. | B, W           | PDFRate-v2                             | Yes
Maiorca and Biggio [63]   | 2019 | Heur.       | B              | PDFRate-v1, PJScan, Slayer N., Hidost  | Yes
malware samples. As we will see, this can be achieved by injecting content into PDF files through
the exploitation of vulnerabilities in their parsing mechanisms (Section 4.2.3).
4.2.1 Optimization-based Evasion Attacks. Evasion attacks (also known as adversarial examples)
consist of minimizing the classifier’s score on the malicious class f(x′) with respect to the input
adversarial sample x′ [9]. The changes to the input vector x′ are constrained to reflect feasible
manipulations of the source malicious PDF file x = ϕ(z). In the case of PDF malware detection,
they are typically restricted to only consider the injection of content. The problem of optimizing
adversarial PDF malware samples was originally formulated by Biggio et al. [9] as:
adversarial PDF malware samples has been originally formulated by Biggio et al. [9] as:
arg min
f (x ′) − λд(x ′) ,
(1)
x ⪯ x ′ , and ∥x ′ − x ∥1 ≤ ε ,
(2)
x′
s.t.
where the first constraint x ⪯ x′ reflects content injection (i.e., each component of x′ has to be
greater than or equal to the corresponding one in x), and the second one bounds the number of
injected elements to be not greater than ε (as it essentially counts in how many elements x and x′
differ). The function g(x′) reflects how similar the manipulated sample x′ is to the benign
distribution, and λ is a trade-off parameter. This trick was introduced by Biggio et al. [9, 10] to
facilitate evasion by avoiding the poor local minima of f(x′) (which typically do not allow the attack
algorithm to find a correct evasion point). In fact, when λ is small, the attacker may find an evasion
point by injecting very few objects, and such a point is normally quite different from
both the malicious and benign training samples. When λ increases, the attack point is modified
in a more significant manner, and its manipulated feature representation becomes harder to
distinguish from that of benign samples. The aforementioned optimization problem is usually
solved by projected gradient descent [9, 10], and it has been used in follow-up work to generate
adversarial PDF malware [83, 96], adversarial binaries [52], and adversarial images against deep
networks [67]. In the area of adversarial images against deep networks, many different
algorithms have been proposed to generate the so-called adversarial examples [19, 40, 60, 75, 88].
They are nevertheless all based on the idea of formulating the attack as an optimization problem,
and on solving it using different gradient-based optimization algorithms. We refer the reader to [17] for
a comprehensive survey on this topic.
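To illustrate how Eqs. (1)-(2) can be solved in practice, the following Python sketch implements a simplified projected gradient descent. The gradient callables grad_f and grad_g, the step size, and the rescaling-based L1 projection are our own simplifying assumptions, not the exact procedure of [9, 10]:

    import numpy as np

    def evade(x, grad_f, grad_g, lam=1.0, eps=10.0, step=0.1, iters=100):
        """Gradient descent on f(x') - lam * g(x'), projected onto Eq. (2)."""
        x_adv = x.astype(float)
        for _ in range(iters):
            # descend along the gradient of the attack objective (Eq. 1)
            x_adv -= step * (grad_f(x_adv) - lam * grad_g(x_adv))
            # projection onto the feasible set of Eq. (2):
            x_adv = np.maximum(x_adv, x)   # x <= x': content injection only
            delta = x_adv - x              # delta >= 0, so its L1 norm is its sum
            if delta.sum() > eps:          # simple rescaling in place of an
                x_adv = x + delta * (eps / delta.sum())  # exact L1 projection
        return x_adv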
To solve the aforementioned problem with gradient-based optimization, the attacker needs to
be able to estimate the gradient of the objective function, which in principle requires white-box
access to the target classifier f(x′), and ideally also to the training data (to estimate the gradient
of g(x′)). However, it has been shown in [9] that such gradients can be estimated reliably even if
the attacker has only gray-box access to the target system. The underlying idea is to first collect a
surrogate training set and query the target system to relabel it (e.g., if the target system is provided
as an online service).7 A surrogate learner f̂ can then be trained on such data, to approximate
the target function f, while an estimate ĝ of the benign sample distribution g can be obtained
directly from the surrogate data. Attacks can thus be staged against the surrogate learner and then
transferred to the target system. With this simple trick, one can craft gradient-based attacks
with both white-box and gray-box access to the target learner.8 In the context of PDF malware,
white-box and gray-box gradient-based attacks have been used in [9, 10, 83, 96]. Black-box attacks
have been more recently proposed in [102], based on the idea of minimizing the value of f(x′)
by only querying the target classifier in an iterative manner, using a genetic algorithm.
These three families of attack are described below. We also discuss how they can be seen as specific
instances of the aforementioned optimization problem.
White-Box Gradient-based Evasion Attacks. This category includes the gradient-based attacks
proposed by Biggio et al. [9, 10] and by Šrndić and Laskov [96], assuming white-box access to the
target model (which includes knowledge of its internal, trained parameters). Biggio et al. [9, 10]
focused on evading Slayer [65], considering three different learning algorithms: SVMs with
the linear and the RBF kernel, and neural networks. Their results showed that, even with a minimal
number of injected keywords (less than 10 against the linear SVM), it is possible to evade PDF
malware detection. These results showed, for the first time, that non-linear malware detectors
can be vulnerable to test-time attacks (contrary to what had been previously reported in [95]).
Although the constrained optimization problem in Eqs. (1)-(2) allows, in principle, the creation of the
corresponding PDF malware samples, this was not concretely demonstrated in [9] (as the analysis
was limited to the manipulation of numerical feature values).
To overcome this limitation, Šrndić and Laskov [96] expanded the work in [9] by performing
a practical evasion of PDFRate-v1 [82]. They considered the same optimization problem with
white-box access to the target system, but injected the selected features for evasion directly into the
malicious PDF file, after the EOF tag. Albeit effective at evading the target system, such an injection
strategy may be easily countered by improving the parsing process [96]. Notably, as PDFRate-v1
employs standard decision trees (which are non-differentiable), the authors replaced the classifier
with a surrogate, differentiable SVM. Even so, the attack can be considered white-box,
as the attacker possesses complete knowledge of the target system. More recently,
Smutz et al. [83] tested the same attack against an evolved version of PDFRate (which we refer to
as PDFRate-v2), which adopted a customized ensemble of trees as a classifier. However, due to
the non-differentiable nature of the trees, the authors directly replaced them with an ensemble
of SVMs. Notably, the attack was performed directly against the SVM ensemble, which was not
used as a surrogate. The attained results showed that gradient-based attacks completely evaded the
system.
Gray-Box Gradient-based Evasion Attacks. This scenario has been explored by Biggio et al. [9,
10] and by Šrndić and Laskov [96]. We point out here its importance, as it shows what happens when
7 However, it has been shown that if the surrogate data is well representative of the training data used by the target system, the relabeling procedure is not even required [9, 30].
8 Although follow-up work has defined these attacks as black-box transfer attacks [74], we prefer to name them gray-box attacks, as such attacks implicitly assume that the attacker knows at least the feature representation.
gradient-descent based attacks are performed under more realistic scenarios. Notably, it is often
unfeasible for the attacker to have perfect knowledge of the target system (including the classifier’s
trained parameters). For this reason, predictably, the efficacy of gradient-descent
attacks is reduced by the fact that the attacker has to learn a surrogate algorithm (and to use
surrogate data) to mount the attack. However, experiments performed against Slayer showed that
all classifiers were completely evaded by simply increasing the number of changes to the file. Most
were evaded after 30-40 changes, i.e., by injecting that number of keywords. This attack
showed that, even under realistic scenarios, gradient descent can be a very effective approach to
evade PDF malware detectors.
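A minimal sketch of this surrogate-based procedure is given below; the query_target function, which stands in for the target system's feedback, is a hypothetical placeholder:

    from sklearn.svm import SVC

    def build_surrogate(X_surrogate, query_target):
        # relabel the surrogate data via the target's feedback (queries)
        y_surrogate = query_target(X_surrogate)
        # a smooth, differentiable learner approximating the target f
        surrogate = SVC(kernel="rbf", gamma="scale", probability=True)
        surrogate.fit(X_surrogate, y_surrogate)
        return surrogate  # craft gradient-based attacks here, then transfer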
Black-Box GA-based Evasion Attacks. This attack has been proposed by Xu et al. [102]. The
underlying idea is to optimize the objective function in Eq. (1) (with λ = 0, i.e., to only minimize
f(x′)), using a genetic algorithm (GA) that iteratively queries the target system in a black-box
scenario. In this case, the input PDF malware sample is directly modified by injecting and deleting
features (accordingly, the constraint x ⪯ x′ in Eq. (2) is not considered here), and then dynamic
analysis is used to verify that its malicious functionality has been preserved. This attack has been
staged against PDFRate-v1 and Hidost, and proven to be very effective at evading both systems.
While this attack is the only one that considers removal of objects from PDF files, it is also very
computationally demanding. The underlying reason is twofold: (i) the genetic algorithm used in
the optimization process requires performing a large number of queries (as it does not exploit
any structured information like gradients to drive the search in the optimization space), and (ii)
the attack requires running dynamic analysis on the manipulated samples. While this may not
be an issue for an attacker whose goal is only developing successful attacks, GA-based evasion
may not be suitable for generating attacks under limited time, computational resources and, most
importantly, number of queries to the target system.
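A heavily simplified skeleton of such a GA-based black-box attack is sketched below. The mutate, score, and is_malicious callables (the latter standing in for a sandbox check) are hypothetical, and the actual implementation in [102] is considerably more sophisticated:

    import random

    def ga_evade(pdf, mutate, score, is_malicious, pop_size=20, generations=50):
        population = [mutate(pdf) for _ in range(pop_size)]
        for _ in range(generations):
            # keep only variants whose malicious behavior survived (sandbox check)
            alive = [p for p in population if is_malicious(p)]
            alive.sort(key=score)              # lower score = more benign-looking
            if alive and score(alive[0]) < 0:  # crossed the decision threshold
                return alive[0]                # evasive and still malicious
            parents = alive[: pop_size // 2] or [pdf]
            population = [mutate(random.choice(parents)) for _ in range(pop_size)]
        return None  # no evasive variant found within the query budget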
4.2.2 Heuristic-based Evasion Attacks. Heuristic-based evasion attacks had been largely used
before optimization-based attacks were found to be more effective. They include strategies that do
not directly optimize an objective function, and this is why they are typically less effective than
optimization-based attacks. Heuristic-based attacks attempt evasion by making malicious samples
more similar to benign ones. This effect is typically obtained either by adding benign features
to malicious files (mimicry) or by injecting a malicious payload into benign files (reverse mimicry).
According to the knowledge possessed by the attacker, we distinguish among the three evasion
strategies discussed below.
White-Box Mimicry. The main idea of this attack is manipulating malware samples to make them
as close as possible to the benign data (in terms of their conditional probability distribution or
distance in feature space), under white-box access to the trained model, i.e., by injecting the most
discriminant features for the target classifier directly into the input sample. This implementation
was adopted in [82] to bypass PDFRate-v1. The attained results showed that it was possible to
decrease the classifier accuracy by more than 20% by injecting only the top 6 discriminant features
in each malware sample. A similar approach was adopted in [95], where the authors tested the
robustness of Hidost by injecting the features that would influence the classifier decision the
most. The attained results showed that it was possible to evade tree-based classifiers with high
probability, but this was not the case for nonlinear SVMs with the RBF kernel.
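A minimal sketch of this feature-injection step for a linear classifier is shown below; it is our own illustration, assuming binary features and positive weights pointing towards the malicious class:

    import numpy as np

    def mimicry(x, w, k=6):
        # indices of the k most benign-oriented features (most negative weights)
        benign_oriented = np.argsort(w)[:k]
        x_adv = x.copy()
        x_adv[benign_oriented] = 1  # inject those features into the malware sample
        return x_adv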
Black-Box Mimicry. In this strategy, the attacker injects into a malicious sample all the objects
contained in a benign file, aiming to subvert the classifier decision. To this end, the attacker only
needs to have black-box access to the target classifier and to collect some benign PDF files. The
easiest way to perform this attack is to blindly copy the entire content of a benign PDF file into the
malicious sample.

Fig. 3. Conceptual flow of mimicry (top) and reverse-mimicry (bottom) attacks.

This approach was adopted by Corona et al. [23] to test the robustness of Lux0R.
In particular, the authors added all the JavaScript API-based features belonging to multiple benign
samples to the given malicious PDF files and measured how the classifier detection was affected.
They also measured the changes in detection rate as the number of injected files increased. Results
showed that, due to the dynamic nature of the features employed, the classifier was still able to
detect most of the malicious samples in the wild, despite the modifications they received. Šrndić
and Laskov [95] also adopted a similar approach to verify the robustness of Hidost (trained, in this
case, with an RBF SVM, differently from the tests they made with white-box mimicry against
tree classifiers), showing that their approach was robust against this kind of mimicry.
Black-Box Reverse Mimicry. In this attack, the attacker injects a specifically-crafted, malicious
payload into a file that is recognized as benign by the classifier, by only exploiting black-box access
to the classifier. The idea is that the corresponding feature modifications should not be sufficient
to change the classification of the source sample from benign to malicious [62–64, 83]. Such a
strategy is exactly the opposite of mimicry attacks: while the latter inject benign information into
a malicious file, reverse mimicry tends to minimize the amount of malicious information injected
(by changing the injected payload when the attack fails). This difference is depicted in the flow
diagrams of Figure 3.
There are three types of reverse-mimicry attacks, according to different types of malicious
information that can be injected:
• JS Embedding. This approach injects JavaScript code directly inside the file objects. Current
implementations feature the injection of only one JavaScript snippet, but it is technically
possible to scatter the code through multiple objects in the file. Notably, this would increase
the probability of the attack being detected, as more keywords have to be used to indicate the
presence of the embedded code.
• EXE Embedding. This technique exploits the CVE-2010-1240 vulnerability, which makes
it possible to run an executable directly from the PDF file itself. In the implementation
proposed by Maiorca et al. [64], the EXE payload was injected through the creation of a
new version of the file.
• PDF Embedding. This strategy features the injection of malicious PDF files in the objects of
the target PDF file. More specifically, attackers can inject specific keywords that cause the
embedded file to open automatically. This technique can be particularly effective, especially
because multiple embedding layers can be easily created (for example, embedding a PDF file
in another PDF file, which is finally embedded in another file). In [62], PeePDF was employed
to carry out this embedding strategy. However, the embedding process is prone to bugs, due
to the complexity of the PDF format. Nevertheless, it is possible to employ libraries such as
Poppler to improve the injection process.
The efficacy of reverse-mimicry attacks has been explored in various works. Maiorca et al. [64]
demonstrated that all reverse mimicry variants were able to evade systems that adopted structural
features, such as PDFRate-v1. Further works [61, 62, 83] proposed possible strategies to mitigate
such attacks, which will be described in more detail in Section 5.
We now summarize the results attained by testing evasive examples created with reverse-mimicry
strategies. In particular, we report the results attained by [63], in which 500 samples for each reverse
mimicry variant (for a total of 1500 samples) were created and tested against multiple systems in
the wild [54, 61, 82, 83, 95]. This comparison is the most recent and fairest among multiple systems
on evasive datasets. All systems (except for PDFRate-v1, which was provided as an online service9)
were trained with the same dataset, composed of more than 20,000 malicious and benign files in the
wild. Figure 4 reports the attained results. For each attack and system, we report the percentage of
evasive samples that were detected. Note that Slayer NEO has been tested in two variants: the keys
variant reflects the work in [65] (in which the system only operated by extracting keywords
as features), while the all variant reflects the full functionality of the system, as tested in [61].
Results clearly show that each system has its strengths and weaknesses. For example, Slayer
NEO performs well against PDF embedding attacks due to its capability of extracting and analyzing
embedded files, while PJScan is excellent at detecting JS embedding attacks and PDFRate-v1
provides reliable detection of EXE embedding. However, none of the tested systems can effectively
detect all reverse-mimicry attacks.
Other Black-Box Attacks. Other black-box attacks involve empirical attempts to exploit vulnerabilities
of the components that belong to the target detectors, such as their parsers. This concept has
been explored in a broader way by Carmony et al. [20], who showed that each parser (including
advanced ones) on which PDF detectors are based can be evaded due to implementation bugs,
design errors, omissions, and ambiguities. For this reason, attackers may be motivated to mostly
target the pre-processing modules of detection systems, thus efficiently exploiting their vulnerabilities.
Indeed, the authors created working proofs of concept with which they were able to evade
both third-party and custom (PDFRate) parsers.
4.2.3 Practical Implementation. As already mentioned in Section 4.2 and in Table 8, a critical
problem of adversarial attacks is the creation of the real attack samples starting from the evasive
feature vectors (i.e., the so-called inverse feature-mapping problem). More specifically, the goal of
the attacker is injecting/removing information into/from the PDF file while keeping its semantics
intact (i.e., the file should work exactly as the unaltered version). However, due to the object-based
structure of the PDF format (see Section 2), manipulating each feature value independently
may not always be feasible. The corresponding file manipulations may compromise the file’s malicious
functionality, especially if the attack also requires removing content.
9 PDFRate-v2 was not available during the experiments in [63].
Fig. 4. Results attained by detection systems (Slayer NEO (All), Slayer NEO (Keys), PDFRate, PJScan, Hidost) against reverse mimicry attacks (PDF Embedding, EXE Embedding, JS Embedding). Each system has been tested against 500 samples for each attack variant (1500 samples in total). Results are reported by showing the percentage of detected samples.
Table 8 shows which works implemented the real attack samples, either by addressing the inverse
feature-mapping problem or by directly manipulating the PDF malware sample. In both cases,
three major strategies were used to inject material while minimizing the risk of compromising the
intrusive functionality of PDF malware, as depicted in Figure 5 [63]. We now provide a comprehensive
description of each technique, also referring to the works where they have been employed.
It is worth noting that these strategies can also be exploited to address the inverse feature-mapping problem
and implement concrete attacks at the input level from the feature-level attack strategies proposed
in [9, 10, 23, 82, 95].
(a) Injection after X-Ref. In this strategy, objects are injected after the X-Ref table, and in particular
after the EOF marker of the file. These objects are never parsed by Adobe Reader or similar, as
their presence is not marked in the cross-reference table. This strategy is easy to implement, but
it strongly relies on exploiting vulnerabilities of the parsing process. As clearly explained by
Carmony et al. [20], all PDF detector parsers suffer from vulnerabilities, as none of them fully
implements the PDF specifications released by Adobe. Such weaknesses are particularly evident
in custom parsers, which may hide strong misalignments with the behavior of Adobe Reader.
However, such an injection strategy can be made ineffective by merely patching the pre-processing
module of the PDF malware detector to be consistent with Adobe Reader. Injection after X-Ref
has been mostly employed by Šrndić and Laskov [96] to evade PDFRate-v1, which employs a
customized parser. Albeit easy to counteract, this strategy simplifies the injection of
different types of structural features (for example, upper/lower-case characters). The same strategy
has been employed by Smutz et al. [83] for their experiments.
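Assuming a detector whose parser naively scans the whole byte stream, a minimal sketch of this strategy can be as simple as appending bytes after the final %%EOF marker:

    def inject_after_eof(pdf_path, payload, out_path):
        with open(pdf_path, "rb") as f:
            data = f.read()
        # content after the final %%EOF is ignored by compliant readers, but a
        # detector parsing the raw byte stream may still pick it up as features
        with open(out_path, "wb") as f:
            f.write(data + b"\n" + payload + b"\n")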
Fig. 5. Content injection in PDF files, performed according to three possible strategies: (a) Injection after X-Ref; (b) Versioning; and (c) Body Injection.
(b) Versioning. In this strategy, attackers use the versioning mechanism of the PDF file format,
i.e., they inject a new body, X-Ref table, and trailer, as if the user had directly modified the file (e.g., by
using an external tool; see Section 2). This strategy is more advanced than the previous one,
as the injected objects are indexed by the new cross-reference table, and therefore they are considered
legitimate by the reader itself. In this way, it is also straightforward to add embedded files or other
malicious (or benign) content that can be used as an aid to perpetrate a more effective attack.
Such an injection strategy can, however, be countered by extracting the content of each version of
the PDF file and processing it separately. Maiorca et al. [61–64] employed this strategy to generate
EXE Embedding attacks (belonging to reverse mimicry). More specifically, they leveraged the
Metasploit framework [1] to automatically generate the infected evasive samples. In this way,
they showed that the versioning mechanism can be easily automated with off-the-shelf tools to
generate evasive variants.
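The sketch below illustrates the general shape of such an incremental update; the offsets are computed in a deliberately simplified way, and a real update would also need to carry over trailer entries such as /Size and /Root:

    def append_update(data, new_obj, obj_num, prev_xref_offset):
        # the new object is appended at the end of the current file
        obj_offset = len(data)
        xref_offset = obj_offset + len(new_obj)
        xref = b"xref\n%d 1\n%010d 00000 n \n" % (obj_num, obj_offset)
        trailer = b"trailer\n<< /Prev %d >>\nstartxref\n%d\n%%%%EOF\n" % (
            prev_xref_offset, xref_offset)
        return data + new_obj + xref + trailer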
(c) Body Injection. In this strategy, attackers directly operate on the existing PDF graph, adding
new objects to the file body and re-arranging the X-Ref table accordingly. This strategy is more
complicated to implement and to detect, as it seamlessly adds objects in a PDF file by reconstructing
the objects and their position on the X-Ref Table. In this way, it is possible to manage and re-arrange
the X-Ref table objects without corrupting the file [20]. Existing objects can also be modified by
adding other name objects and rearranging the X-Ref table positions accordingly. Notably, it is
essential to ensure that the embedded content (i.e., the exploitation code) is correctly executed
when the merged PDF is opened. The correct execution of the embedded content is often not easy
to achieve, as it requires injecting additional objects specifically for this purpose. This strategy was
employed by Maiorca et al. [61–64] (and also used by Smutz et al. [83] and Carmony et al. [20]) to
generate JS and PDF embedding attacks. In this way, the attacker can choose exactly in which part
of the file body the malicious payload is injected, thus making the automatic detection of such
attacks harder.
An extension of the aforementioned body-injection strategy has been adopted by Xu et al. [102],
to account also for object removal and replacement. In particular, after each manipulation of the
file body (including object addition, removal, or replacement), the evasive sample is automatically
tested in a Cuckoo sandbox. If the attempted change disrupts the functionality of the file (i.e., the
PDF malware does not contact a malicious target URL anymore), the original file is restored. Albeit
computationally expensive, this strategy allows one to precisely verify whether it is possible to
perform such specific changes to the source PDF file.
4.3 Poisoning and Privacy Attacks against PDF Malware Detectors
In this Section, we discuss two other popular categories of attack defined in the literature of
adversarial machine learning, under the name of poisoning and privacy attacks [17, 45]. Even
though, to our knowledge, such attacks have never been considered in the context of PDF malware
detection, we discuss here how they can be used to pose new threats to PDF malware detectors.
4.3.1 Poisoning Attacks. Poisoning attacks aim to reduce the detection capabilities of PDF
detectors by injecting wrongly-labeled samples into the classifier’s training set. According to the
taxonomy proposed in Section 4.1.4, we distinguish among three possible attacks:
Poisoning Availability Attacks. This attack can be carried out against online services that ask
for user feedback about the classification results (such as PDFRate-v1, which used to be available
online). More specifically, the attacker can submit several malicious PDF files for analysis.
When the system asks for feedback, she can intentionally claim that such samples are benign, hoping
that the system gets retrained by including the wrongly-labeled samples. If the attacker has no
explicit control over the labeling process, she may construct benign samples that contain spurious
malicious features, and malicious samples with benign content, in a way that preserves their original
labeling. Similar attacks have been staged against anti-spam filters [45, 71], to eventually increase
the classification error at test time and cause a denial of service to legitimate users.
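As a toy illustration (our own sketch, not an experiment from the literature), such an attack can be simulated by flipping the labels of an attacker-controlled fraction of the training set and measuring the resulting accuracy drop:

    import numpy as np
    from sklearn.svm import LinearSVC

    def poisoned_accuracy(X_tr, y_tr, X_te, y_te, frac=0.1, seed=0):
        rng = np.random.default_rng(seed)
        y_pois = y_tr.copy()
        # flip the labels of an attacker-controlled fraction (0/1 labels assumed)
        idx = rng.choice(len(y_tr), size=int(frac * len(y_tr)), replace=False)
        y_pois[idx] = 1 - y_pois[idx]
        clf = LinearSVC().fit(X_tr, y_pois)
        return clf.score(X_te, y_te)  # compare against the clean-training baseline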
Poisoning Integrity Attacks. This attack is similar to the one employed in poisoning availability.
However, instead of aiming to increase the classification error at test time to cause a denial of
service, the goal of the attacker here is to cause the misclassification of a specific subset of malicious
PDF files at test time. For example, given a PDF malware sample which is correctly recognized
by the target system, the attacker may start injecting benign samples with spurious malicious
characteristics extracted from the malicious PDF file into the training set of the target classifier.
Once updated, the target classifier may thus no longer correctly detect the given PDF malware
sample. Overall, poisoning integrity attacks aim to facilitate evasion at test time.
Backdoor Attacks. The goal of this attack is the same as poisoning integrity, i.e., to facilitate
evasion at test time, even though the attack strategy may be implemented in a different manner. In
particular, the attacker may publicly release the backdoored model, which may be subsequently
used in some publicly-available online services or commercial tools. If this happens, the attacker
can craft her malicious samples (including the backdoor activation signature) to bypass detection.
4.3.2 Privacy Attacks. Privacy attacks in the context of PDF malware detection primarily aim to
steal information about the classification model or the training data. They can be organized
into three categories:
Model Stealing/Extraction Attacks. This attack can be performed on systems whose classification
scores (or other information) are available [92]. The attacker can submit PDF files with
progressive modifications to their structure (or scripting code) to infer which changes occur in the
information retrieved from the models (in a similar fashion to what was performed in [102]). In this
way, the attacker may eventually reconstruct the detection model with high accuracy, and sell an
alternative online service at a cheaper price.
Model Inversion Attacks. The attack strategy is similar to model stealing attacks, but the goal is
reconstructing the employed training data [36]. This may violate user privacy if the attacker is able
to reconstruct some benign PDF sample containing private information.
Membership Inference Attacks. This attack strategy is similar to the previous two cases, but the
goal is to understand if a specific sample was used as part of the training data. For example, this can
be useful to infer if the system was trained with PDF files that can be easily found on search engines,
or if the system resorts to data available from publicly-distributed malware datasets [70, 81].
5 OPEN ISSUES AND RESEARCH CHALLENGES
We discuss here the current open issues related to PDF malware detection and sketch some promising
future research directions. More specifically, further research can be developed along two
perspectives, both suggested by the research area of adversarial machine learning: (i) demonstrating
novel adversarial attacks and potential threats against PDF malware detectors, and (ii)
improving their robustness under attack, by leveraging previous work on defensive mechanisms
from adversarial machine learning. In the following, we discuss both perspectives.
5.1 Adversarial Attacks
Considering what we pointed out in Sections 4.2 and 4.3, it is evident that research on attacks
against learning-based PDF malware detectors has mainly focused on evasion. In particular, state-of-the-art
work has followed two main directions: on the one hand, the creation
of concrete, working adversarial examples by using empirical approaches, with the drawback of
increasing the probability that the attacks fail due to the limited knowledge of the target system;
on the other hand, the development of evasion algorithms, leading to approaches that allowed very
efficient evasion without the creation of real samples (due to, for example, changes that could not be
correctly performed in practice, such as deleting specific keywords). Notably, one critical problem
to be solved is related to inverse feature mapping, i.e., creating real samples from the corresponding
evasive feature vectors. Normally, to preserve the original behavior of the file, injecting information
is typically safer than removing it. Nevertheless, even injection could compromise the overall file
semantics and visualization properties; this effect could be triggered, e.g., by embedding font- or
page-related keywords in specific objects.
Recent work has demonstrated that it is possible to remove specific structural features (e.g.,
keywords) from PDF files without compromising their functionality [102]. However, there is still
much room for research on this topic. More specifically, attackers could focus on identifying a set of
features that can be safely deleted (depending on the file context), or even replaced with equivalent
ones. For example, one could remove every reference to JavaScript code and replace it with code
in another language, such as ActionScript. Concerning this aspect, it would also be intriguing to
inspect the dependence between certain features (for example, deleting one keyword may force
the attacker also to delete a second one). In this way, one may think of increasing the efficacy of
gradient-descent algorithms by including a selection of erasable/replaceable information.
Finally, alternatives to evasion attacks can be explored. As stated in Section 4.3, poisoning and
privacy attacks have yet to be carried out against PDF learners. One particularly interesting aspect
is how such attacks can be useful in practice against current detection systems. For example,
poisoning PDF detectors may compromise the performance of publicly available (or even open-source)
learning models. Likewise, model-stealing strategies can be employed to extract the
characteristics of unknown detectors, leading to the development of more effective evasion attacks.
5.2 Robust Malware Detection
From what we pointed out in Sections 3 and 4, it is clear that every released detector features
specific weaknesses that are either related to its parser, feature extractor, or classifier. However,
while most state-of-the-art works focused on proposing evasion strategies, only a few pointed
out and tested possible countermeasures against such attacks. In the following, we summarize the
proposed mitigation approaches with the same organization proposed in Section 4.2.
Detection of Optimization-based Attacks. The only approach to detect white- and
gray-box gradient-based attacks against PDF files has been proposed by Smutz et al. [83], who
introduced an improvement of the common tree-based ensemble detection mechanism used for PDF
detectors. By considering the voting results of each tree-based component of the ensemble, the
authors defined a region of uncertainty in which they classified evasion samples that obtained a
score between specific thresholds. In this way, albeit such samples were not explicitly regarded
as malicious, the uncertain label would be enough to warn the user about possible evasion attempts.
The attained results showed that PDFRate-v2 performed significantly better than the previous
PDFRate-v1 at detecting the attack proposed by Šrndić and Laskov [96].
Detection of Heuristic-based Attacks. Concerning white-box mimicry, Smutz et al. [82] proposed
a mitigation strategy in which the most discriminant features can be directly removed from
the feature vector. However, such a choice can significantly impact the detection performance.
To detect black-box reverse mimicry attacks, Maiorca et al. [62–64] proposed a combination of
features that are extracted both from the structure and the content (in terms of embedded code) of
the file, along with the extraction and separate analysis of embedded PDF files. In this way, it is
possible to significantly mitigate such attacks (especially PDF Embedding ones, which are entirely
detected), although more than 30% of JS Embedding and EXE Embedding variants still managed to
bypass detection [61–63]. For the same problem, Smutz et al. [83] employed the same mitigation
strategy proposed for detecting optimization-based attacks, and showed that the detection of reverse
mimicry attacks (in particular, JS Embedding) significantly improved. An exception to such
improved detection is the PDF embedding variant, which still manages to evade the system, as
even PDFRate-v2 does not perform any analysis of possible embedded PDF payloads.
Despite previous efforts at detecting evasion attacks, it is clear that a bulletproof solution to
detect such attacks has not yet been developed. Accordingly, we propose three research guidelines
that can be further expanded and applied both against optimization- and heuristic-based attacks.
Parsing Reliability. Proper parsing should always include, among others: (i) the extraction of
embedded content (which should be analyzed separately, either with the same or with different
detectors); (ii) the use of consolidated parsers (e.g., third-party libraries or, in general, parsers whose
behavior is closest to the official specifications); (iii) robustness against malformed files (including
benign ones) that may make the parser crash. Further research could focus on implementing these
three characteristics and discussing their impact on the overall detection performance.
Feature Engineering. Following the results obtained by Maiorca et al. [62–64], it is clear that
proper feature engineering would drastically increase the robustness of learning systems
against black- and white-box attacks. For this reason, the key research direction should
be designing features that are hard for the attacker to modify. Dynamic
features are generally more robust, as the attacker has to change the file behavior in order to change
the feature value. However, when it is not possible or feasible to analyze PDF files dynamically,
combining different types of static features can be a winning solution that may significantly increase
the effort required by the attacker to evade the target systems.
Robust and Explainable Learning Algorithms. Research should push on developing algorithms
that are more robust against perturbations made by attackers. To this end, explicit knowledge of
the attack model has to be considered during the design of the learning algorithm. One possibility
is to use robust optimization algorithms or, more generally, game-theoretical models. Such models
have been widely studied in the literature of adversarial machine learning [18, 39], largely before
the introduction of adversarial training [40], which essentially follows the same underlying idea.
Under this game-theoretical learning framework, the attacker and the classifier maximize different
objective functions.10 In particular, while the attacker manipulates data within certain bounds to
evade detection, the classifier is iteratively retrained to correctly detect the manipulated samples [17,
18, 39, 77]. We argue that game-theoretical models may be helpful to improve the robustness of
PDF malware detectors. Explainable machine-learning algorithms may provide another useful
asset for system designers to analyze the trained models and understand their weaknesses. This
approach has been recently adopted to inspect the security properties of learning systems and
highlight their vulnerabilities [6, 28, 44, 68, 76]. In particular, it has already been observed (e.g., in
spam filtering [12, 51] and Android malware detection [29]) that learning algorithms may tend to
overemphasize some features, and that this facilitates evasion by a skilled attacker who carefully
manipulates only those feature values. Developing specific learning approaches that distribute
feature importance more evenly can significantly improve robustness to such adversarial evasion
attempts [12, 29, 51, 68]. This has been clearly demonstrated by Demontis et al. [29] in the context
of adversarial Android malware detection. In that work, the authors have shown that classifiers
trained using a principled approach based on robust optimization eventually provide more-evenly
distributed feature weights and much stronger robustness guarantees (even than the robust deep
networks developed for the same task in [42]). Another defensive mechanism against evasion attacks
consists of rejecting anomalous samples at test time, similarly to the uncertainty-region approach
defined by Smutz et al. [83]. More specifically, when a test sample is assigned an anomalous score
in comparison to the scores typically assigned to benign or malware samples, it can be flagged
as anomalous or classified as suspicious, requiring further manual inspection (see, e.g., the work
in [67] for an example on adversarial robot vision). Defensive mechanisms have also been proposed
against poisoning and privacy attacks, respectively based on training data sanitization and robust
learning, and on leveraging differentially-private mechanisms for learning private classifiers (we
refer the reader to [17] for further details).
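As an illustration of this iterative retraining idea, the following sketch (our own simplification, with craft_evasions standing in for any evasion procedure, e.g., the gradient-based attack of Section 4.2.1) augments the training set with correctly-labeled evasive variants at each round:

    import numpy as np
    from sklearn.svm import SVC

    def adversarially_train(X, y, craft_evasions, rounds=5):
        clf = SVC(kernel="rbf", probability=True).fit(X, y)
        for _ in range(rounds):
            # craft evasive variants of the current training malware (label 1)
            X_adv = craft_evasions(clf, X[y == 1])
            X = np.vstack([X, X_adv])                     # augment the training set
            y = np.concatenate([y, np.ones(len(X_adv))])  # with correct labels
            clf = SVC(kernel="rbf", probability=True).fit(X, y)
        return clf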
We finally argue that the aforementioned guidelines are not only useful for PDF malware
detectors, but they can be extended to other malware detection systems that may be targeted by
adversarial attacks. We believe that a correct application of the principles described above can be
genuinely beneficial and can constitute a substantial aid to increase security of such systems.
6 CONCLUSIONS
In this paper, we presented a survey of malicious PDF detection in adversarial environments,
featuring multi-fold contributions. First, we described how malicious PDF files perform their
attacks in the wild. We then described all machine learning-based solutions for PDF malware
detection proposed in the last decade. Notably, this part differs from previously
proposed surveys, such as the ones by Nissim et al. [72] and Elingiusti et al. [32], which did not focus
10 For the sake of completeness, it is worth pointing out that, in robust optimization, they maximize the same objective with opposite sign, yielding a zero-sum game.
on those system components that are crucial to understanding adversarial attacks. For example,
our work provided a deep insight into the pre-processing of PDF files, which has been exploited
by many adversarial attacks. Second, we provided a comprehensive overview of the adversarial
attacks that have been carried out against PDF malware detectors. In particular, we categorized
the existing adversarial attacks against PDF detectors under a unifying framework and described them, also sketching possible novel strategies that attackers may employ. To this end, we
employed a methodology similar to previous work on command-and-control botnets [38]. However,
PDF malware detection is clearly a different task and, to the best of our knowledge, this is the first work to thoroughly describe adversarial machine learning applied to this field. The strong focus on adversarial aspects also clearly distinguishes this survey from other works on data mining for malware detection; for example, the work in [103] is not focused on the adversarial
aspects of the problem. Although the work by Cuan et al. [25] pointed out some basic aspects of how adversarial machine learning has been applied to the detection of malicious PDF files, our work expands on and discusses this problem in much greater detail. Finally, we discussed existing
mitigation strategies and sketched new research directions that could allow detecting not only
current adversarial attacks but also novel ones.
Notably, the main goal of this work is to discuss how PDF malware analysis has brought, over recent years, significant advancements in understanding how adversarial machine learning can be applied to malware detection. The recent results described in this work have been
beneficial in both research fields. On the one hand, research demonstrated that adversarial attacks constitute a concrete and emerging security threat that can be extremely dangerous, as machine
learning is now widely employed even by standard anti-malware solutions. On the other hand,
discoveries concerning adversarial attacks pushed developers and security analysts to develop better
protections and to explore novel information about malware that can be useful for classification.
We believe that future work should focus on a more rigorous application of security-by-design
principles when building detection systems. In this way, it will be possible to offer real, active mitigation of the numerous evasion attacks that are increasingly used in the wild.
ACKNOWLEDGMENTS
This work was supported by the INCLOSEC project, funded by the Sardinian Regional Administration (CUP G88C17000080006).
REFERENCES
[1] [n. d.]. Metasploit framework: http://www.metasploit.com/. ([n. d.]).
[2] Adobe. 2006. PDF Reference. Adobe Portable Document Format Version 1.7.
[3] Adobe. 2007. JavaScript for Acrobat API Reference.
[4] Adobe. 2008. Adobe Supplement to ISO 32000.
[5] Anish Athalye, Nicholas Carlini, and David A. Wagner. 2018. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. In ICML (JMLR Workshop and Conference Proceedings), Vol. 80. JMLR.org, 274–283.
[6] David Baehrens, Timon Schroeter, Stefan Harmeling, Motoaki Kawanabe, Katja Hansen, and Klaus-Robert Müller. 2010. How to Explain Individual Classification Decisions. J. Mach. Learn. Res. 11 (2010), 1803–1831.
[7] Marco Barreno, Blaine Nelson, Anthony Joseph, and J. Tygar. 2010. The security of machine learning. Machine Learning 81, 2 (2010), 121–148.
[8] Marco Barreno, Blaine Nelson, Russell Sears, Anthony D. Joseph, and J. D. Tygar. 2006. Can machine learning be secure?. In Proc. ACM Symp. Information, Computer and Comm. Sec. (ASIACCS '06). ACM, New York, NY, USA, 16–25.
[9] B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Šrndić, P. Laskov, G. Giacinto, and F. Roli. 2013. Evasion attacks against machine learning at test time. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), Part III (Lecture Notes in Computer Science), Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, and Filip Železný (Eds.), Vol. 8190. Springer Berlin Heidelberg, 387–402.
[10] Battista Biggio, Igino Corona, Blaine Nelson, Benjamin I. P. Rubinstein, Davide Maiorca, Giorgio Fumera, Giorgio Giacinto, and Fabio Roli. 2014. Security Evaluation of Support Vector Machines in Adversarial Environments. In Support Vector Machines Applications, Yunqian Ma and Guodong Guo (Eds.). Springer International Publishing, 105–153. https://doi.org/10.1007/978-3-319-02300-7_4
[11] Battista Biggio, Luca Didaci, Giorgio Fumera, and Fabio Roli. 2013. Poisoning attacks to compromise face templates. In 6th IAPR Int'l Conf. on Biometrics (ICB 2013). Madrid, Spain, 1–7.
[12] Battista Biggio, Giorgio Fumera, and Fabio Roli. 2010. Multiple Classifier Systems for Robust Classifier Design in Adversarial Environments. Int'l J. Mach. Learn. and Cybernetics 1, 1 (2010), 27–41.
[13] Battista Biggio, Giorgio Fumera, and Fabio Roli. 2014. Pattern Recognition Systems under Attack: Design Issues and Research Challenges. IJPRAI 28, 7 (2014).
[14] Battista Biggio, Giorgio Fumera, and Fabio Roli. 2014. Security Evaluation of Pattern Classifiers Under Attack. IEEE Transactions on Knowledge and Data Engineering 26, 4 (April 2014), 984–996.
[15] Battista Biggio, Giorgio Fumera, Fabio Roli, and Luca Didaci. 2012. Poisoning Adaptive Biometric Systems. In Structural, Syntactic, and Statistical Pattern Recognition, Georgy Gimel'farb, Edwin Hancock, Atsushi Imiya, Arjan Kuijper, Mineichi Kudo, Shinichiro Omachi, Terry Windeatt, and Keiji Yamada (Eds.). Lecture Notes in Computer Science, Vol. 7626. Springer Berlin Heidelberg, 417–425. http://dx.doi.org/10.1007/978-3-642-34166-3_46
[16] Battista Biggio, Blaine Nelson, and Pavel Laskov. 2012. Poisoning attacks against support vector machines. In 29th Int'l Conf. on Machine Learning (ICML), John Langford and Joelle Pineau (Eds.). 1807–1814.
[17] Battista Biggio and Fabio Roli. 2018. Wild patterns: Ten years after the rise of adversarial machine learning. Pattern Recognition 84 (2018), 317–331. https://doi.org/10.1016/j.patcog.2018.07.023
[18] Michael Brückner, Christian Kanzow, and Tobias Scheffer. 2012. Static Prediction Games for Adversarial Learning Problems. J. Mach. Learn. Res. 13, 1 (Sept. 2012), 2617–2654. http://dl.acm.org/citation.cfm?id=2503308.2503326
[19] Nicholas Carlini and David A. Wagner. 2017. Towards Evaluating the Robustness of Neural Networks. In IEEE Symposium on Security and Privacy. IEEE Computer Society, 39–57.
[20] Curtis Carmony, Xunchao Hu, Heng Yin, Abhishek Vasisht Bhaskar, and Mu Zhang. 2016. Extract Me If You Can: Abusing PDF Parsers in Malware Detectors. In NDSS. The Internet Society.
[21] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. 2017. ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks Without Training Substitute Models. In 10th ACM Workshop on Artificial Intelligence and Security (AISec '17). ACM, New York, NY, USA, 15–26.
[22] X. Chen, C. Liu, B. Li, K. Lu, and D. Song. 2017. Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning. ArXiv e-prints abs/1712.05526 (2017).
[23] Igino Corona, Davide Maiorca, Davide Ariu, and Giorgio Giacinto. 2014. Lux0R: Detection of Malicious PDF-embedded JavaScript code through Discriminant Analysis of API References. In Proc. 2014 Workshop on Artificial Intelligent and Security Workshop (AISec '14). ACM, New York, NY, USA, 47–57.
[24] Marco Cova, Christopher Kruegel, and Giovanni Vigna. 2010. Detection and Analysis of Drive-by-download Attacks and Malicious JavaScript Code. In Proceedings of the 19th International Conference on World Wide Web (WWW '10). ACM, New York, NY, USA, 281–290. https://doi.org/10.1145/1772690.1772720
[25] Bonan Cuan, Aliénor Damien, Claire Delaplace, and Mathieu Valois. 2018. Malware Detection in PDF Files using Machine Learning. PhD Dissertation, REDOCS (2018).
[26] Nilesh Dalvi, Pedro Domingos, Mausam, Sumit Sanghai, and Deepak Verma. 2004. Adversarial classification. In Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). Seattle, 99–108.
[27] Hung Dang, Yue Huang, and Ee-Chien Chang. 2017. Evading Classifiers by Morphing in the Dark. In ACM SIGSAC Conf. Computer and Comm. Sec. (CCS '17). ACM, 119–133.
[28] Luca Demetrio, Battista Biggio, Giovanni Lagorio, Fabio Roli, and Alessandro Armando. 2019. Explaining Vulnerabilities of Deep Learning to Adversarial Malware Binaries. In 3rd Italian Conference on Cyber Security (ITASEC), Vol. 2315. CEUR Workshop Proceedings.
[29] Ambra Demontis, Marco Melis, Battista Biggio, Davide Maiorca, Daniel Arp, Konrad Rieck, Igino Corona, Giorgio Giacinto, and Fabio Roli. In press. Yes, Machine Learning Can Be More Secure! A Case Study on Android Malware Detection. IEEE Trans. Dependable and Secure Computing (In press).
[30] Ambra Demontis, Marco Melis, Maura Pintor, Matthew Jagielski, Battista Biggio, Alina Oprea, Cristina Nita-Rotaru, and Fabio Roli. 2018. Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks. arXiv e-prints arXiv:1809.02861 (Sep 2018).
[31] Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Xiaolin Hu, and Jun Zhu. 2018. Boosting Adversarial Attacks with Momentum. In CVPR.
[32] Michele Elingiusti, Leonardo Aniello, Leonardo Querzoni, and Roberto Baldoni. 2018. PDF-Malware Detection: A Survey and Taxonomy of Current Techniques. Springer International Publishing, Cham, 169–191. https://doi.org/10.1007/978-3-319-73951-9_9
[33] ESET. 2018. A Tale of two zero-days. (2018). https://www.welivesecurity.com/2018/05/15/tale-two-zero-days/
[34] Jose Miguel Esparza. 2017. PeePDF. (2017). http://eternal-todo.com/tools/peepdf-pdf-analysis-tool
[35] Fortinet. 2016. Analysis of CVE-2016-4203 - Adobe Acrobat and Reader CoolType Handling Heap Overflow Vulnerability. (2016). https://www.fortinet.com/blog/threat-research/analysis-of-cve-2016-4203-adobe-acrobat-and-reader-cooltype-handling-heap-overflow-vulnerability.html
[36] Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. 2015. Model Inversion Attacks That Exploit Confidence Information and Basic Countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS '15). ACM, New York, NY, USA, 1322–1333.
[37] FreeDesktop.org. 2018. Poppler. (2018). https://poppler.freedesktop.org/
[38] Joseph Gardiner and Shishir Nagaraja. 2016. On the Security of Machine Learning in Malware C&C Detection: A Survey. ACM Comput. Surv. 49, 3, Article 59 (Dec. 2016), 39 pages. https://doi.org/10.1145/3003816
[39] Amir Globerson and Sam T. Roweis. 2006. Nightmare at test time: robust learning by feature deletion. In Proceedings of the 23rd International Conference on Machine Learning, William W. Cohen and Andrew Moore (Eds.), Vol. 148. ACM, 353–360.
[40] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and Harnessing Adversarial Examples. In International Conference on Learning Representations.
[41] Google. 2018. VirusTotal. (2018). http://www.virustotal.com
[42] Kathrin Grosse, Nicolas Papernot, Praveen Manoharan, Michael Backes, and Patrick D. McDaniel. 2017. Adversarial Examples for Malware Detection. In ESORICS (2) (LNCS), Vol. 10493. Springer, 62–79.
[43] Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. 2017. BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. In NIPS Workshop on Machine Learning and Computer Security, Vol. abs/1708.06733.
[44] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. 2019. A Survey of Methods for Explaining Black Box Models. ACM Comput. Surv. 51, 5 (2019), 93:1–93:42.
[45] L. Huang, A. D. Joseph, B. Nelson, B. Rubinstein, and J. D. Tygar. 2011. Adversarial Machine Learning. In 4th ACM Workshop on Artificial Intelligence and Security (AISec 2011). Chicago, IL, USA, 43–57.
[46] M. Jagielski, A. Oprea, B. Biggio, C. Liu, C. Nita-Rotaru, and B. Li. 2018. Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning. In IEEE Symposium on Security and Privacy (SP '18). IEEE CS, 931–947. https://doi.org/10.1109/SP.2018.00057
[47] Yujie Ji, Xinyang Zhang, Shouling Ji, Xiapu Luo, and Ting Wang. 2018. Model-Reuse Attacks on Deep Learning Systems. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS '18). ACM, New York, NY, USA, 349–363.
[48] Kaspersky. 2017. Machine Learning for Malware Detection.
[49] Marius Kloft and Pavel Laskov. 2012. Security Analysis of Online Centroid Anomaly Detection. Journal of Machine Learning Research 13 (2012), 3647–3690.
[50] P. W. Koh and P. Liang. 2017. Understanding Black-box Predictions via Influence Functions. In International Conference on Machine Learning (ICML).
[51] Aleksander Kolcz and Choon Hui Teo. 2009. Feature Weighting for Improved Classifier Robustness. In Sixth Conference on Email and Anti-Spam (CEAS). Mountain View, CA, USA.
[52] Bojan Kolosnjaji, Ambra Demontis, Battista Biggio, Davide Maiorca, Giorgio Giacinto, Claudia Eckert, and Fabio Roli. 2018. Adversarial Malware Binaries: Evading Deep Learning for Malware Detection in Executables. In Proceedings of the 26th European Signal Processing Conference.
[53] Sogeti ESEC Lab. 2015. Origami Framework. (2015). http://esec-lab.sogeti.com/pages/origami.html
[54] Pavel Laskov and Nedim Šrndić. 2011. Static Detection of Malicious JavaScript-bearing PDF Documents. In Proceedings of the 27th Annual Computer Security Applications Conference (ACSAC '11). ACM, New York, NY, USA, 373–382. https://doi.org/10.1145/2076732.2076785
[55] Daiping Liu, Haining Wang, and Angelos Stavrou. 2014. Detecting Malicious Javascript in PDF Through Document Instrumentation. In Proceedings of the 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN '14). IEEE Computer Society, Washington, DC, USA, 100–111. https://doi.org/10.1109/DSN.2014.92
[56] Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. 2017. Delving into Transferable Adversarial Examples and Black-box Attacks. In ICLR.
[57] Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. 2018. Trojaning Attack on Neural Networks. In 25th Annual Network and Distributed System Security Symposium (NDSS). The Internet Society.
[58] Daniel Lowd and Christopher Meek. 2005. Adversarial Learning. In Proc. 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). ACM Press, Chicago, IL, USA, 641–647.
[59] X. Lu, J. Zhuge, R. Wang, Y. Cao, and Y. Chen. 2013. De-obfuscation and Detection of Malicious PDF Files with High Accuracy. In 2013 46th Hawaii International Conference on System Sciences. 4890–4899. https://doi.org/10.1109/HICSS.2013.166
[60] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. 2018. Towards Deep Learning Models Resistant to Adversarial Attacks. In ICLR.
[61] Davide Maiorca, Davide Ariu, Igino Corona, and Giorgio Giacinto. 2015. An Evasion Resilient Approach to the Detection of Malicious PDF Files. In Information Systems Security and Privacy, Olivier Camp, Edgar Weippl, Christophe Bidan, and Esma Aïmeur (Eds.). Springer International Publishing, Cham, 68–85.
[62] Davide Maiorca, Davide Ariu, Igino Corona, and Giorgio Giacinto. 2015. A Structural and Content-based Approach for a Precise and Robust Detection of Malicious PDF Files. In ICISSP 2015 - Proceedings of the 1st International Conference on Information Systems Security and Privacy, ESEO, Angers, Loire Valley, France, 9-11 February, 2015. 27–36. https://doi.org/10.5220/0005264400270036
[63] Davide Maiorca and Battista Biggio. 2019. Digital Investigation of PDF Files: Unveiling Traces of Embedded Malware. IEEE Security & Privacy 17, 1 (Jan 2019), 63–71. https://doi.org/10.1109/MSEC.2018.2875879
[64] Davide Maiorca, Igino Corona, and Giorgio Giacinto. 2013. Looking at the Bag is Not Enough to Find the Bomb: An Evasion of Structural Methods for Malicious PDF Files Detection. In Proceedings of the 8th ACM SIGSAC Symposium on Information, Computer and Communications Security (ASIA CCS '13). ACM, New York, NY, USA, 119–130. https://doi.org/10.1145/2484313.2484327
[65] Davide Maiorca, Giorgio Giacinto, and Igino Corona. 2012. A Pattern Recognition System for Malicious PDF Files Detection. In Machine Learning and Data Mining in Pattern Recognition - 8th International Conference, MLDM 2012, Berlin, Germany, July 13-20, 2012. Proceedings. 510–524. https://doi.org/10.1007/978-3-642-31537-4_40
[66] Shike Mei and Xiaojin Zhu. 2015. Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners. In 29th AAAI Conf. Artificial Intelligence (AAAI '15).
[67] Marco Melis, Ambra Demontis, Battista Biggio, Gavin Brown, Giorgio Fumera, and Fabio Roli. 2017. Is Deep Learning Safe for Robot Vision? Adversarial Examples against the iCub Humanoid. In ICCVW Vision in Practice on Autonomous Robots (ViPAR). IEEE, 751–759.
[68] Marco Melis, Davide Maiorca, Battista Biggio, Giorgio Giacinto, and Fabio Roli. 2018. Explaining Black-box Android Malware Detection. In 26th European Signal Processing Conf. (EUSIPCO). IEEE, Rome, Italy, 524–528.
[69] Luis Muñoz-González, Battista Biggio, Ambra Demontis, Andrea Paudice, Vasin Wongrassamee, Emil C. Lupu, and Fabio Roli. 2017. Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization. In 10th ACM Workshop on Artificial Intelligence and Security (AISec '17), Bhavani M. Thuraisingham, Battista Biggio, David Mandell Freeman, Brad Miller, and Arunesh Sinha (Eds.). ACM, New York, NY, USA, 27–38.
[70] Milad Nasr, Reza Shokri, and Amir Houmansadr. 2018. Machine Learning with Membership Privacy using Adversarial Regularization. In ACM Conference on Computer and Communications Security. ACM, 634–646.
[71] Blaine Nelson, Marco Barreno, Fuching Jack Chi, Anthony D. Joseph, Benjamin I. P. Rubinstein, Udam Saini, Charles Sutton, J. D. Tygar, and Kai Xia. 2008. Exploiting machine learning to subvert your spam filter. In LEET'08: Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats. USENIX Association, Berkeley, CA, USA, 1–9.
[72] Nir Nissim, Aviad Cohen, Chanan Glezer, and Yuval Elovici. 2015. Detection of malicious PDF files and directions for enhancements: A state-of-the art survey. Computers & Security 48 (2015), 246–266.
[73] Sentinel One. 2018. SentinelOne Detects New Malicious PDF File. (2018). https://www.sentinelone.com/blog/sentinelone-detects-new-malicious-pdf-file/
[74] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z. Berkay Celik, and Ananthram Swami. 2017. Practical Black-Box Attacks Against Machine Learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security (ASIA CCS '17). ACM, New York, NY, USA, 506–519. https://doi.org/10.1145/3052973.3053009
[75] Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami. 2016. The Limitations of Deep Learning in Adversarial Settings. In Proc. 1st IEEE European Symposium on Security and Privacy. IEEE, 372–387.
[76] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In 22nd ACM SIGKDD Int'l Conf. Knowl. Disc. Data Mining (KDD '16). ACM, New York, NY, USA, 1135–1144.
[77] S. Rota Bulò, B. Biggio, I. Pillai, M. Pelillo, and F. Roli. 2017. Randomized Prediction Games for Adversarial Machine Learning. IEEE Transactions on Neural Networks and Learning Systems 28, 11 (2017), 2466–2478.
[78] F. Schmitt, J. Gassen, and E. Gerhards-Padilla. 2012. PDF Scrutinizer: Detecting JavaScript-based attacks in PDF documents. In 2012 Tenth Annual International Conference on Privacy, Security and Trust (PST), Vol. 00. 104–111. https://doi.org/10.1109/PST.2012.6297926
[79] Offensive Security. 2018. Exploit Database. (2018). https://www.exploit-db.com/
[80] M. Zubair Shafiq, Syed Ali Khayam, and Muddassar Farooq. 2008. Embedded Malware Detection Using Markov n-Grams. In Proceedings of the 5th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA '08). Springer-Verlag, Berlin, Heidelberg, 88–107. https://doi.org/10.1007/978-3-540-70542-0_5
[81] R. Shokri, M. Stronati, C. Song, and V. Shmatikov. 2017. Membership Inference Attacks Against Machine Learning Models. In 2017 IEEE Symposium on Security and Privacy (SP). 3–18.
[82] Charles Smutz and Angelos Stavrou. 2012. Malicious PDF Detection Using Metadata and Structural Features. In Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC '12). ACM, New York, NY, USA, 239–248. https://doi.org/10.1145/2420950.2420987
[83] Charles Smutz and Angelos Stavrou. 2016. When a Tree Falls: Using Diversity in Ensemble Classifiers to Identify Evasion in Malware Detectors. In 23rd Annual Network and Distributed System Security Symposium, NDSS 2016, San Diego, California, USA, February 21-24, 2016. http://wp.internetsociety.org/ndss/wp-content/uploads/sites/25/2017/09/when-tree-falls-using-diversity-ensemble-classifiers-identify-evasion-malware-detectors.pdf
[84] Kevin Z. Snow, Srinivas Krishnan, Fabian Monrose, and Niels Provos. 2011. SHELLOS: Enabling Fast Detection and Forensic Analysis of Code Injection Attacks. In Proceedings of the 20th USENIX Conference on Security (SEC'11). USENIX Association, Berkeley, CA, USA, 9–9. http://dl.acm.org/citation.cfm?id=2028067.2028076
[85] Nedim Šrndić and Pavel Laskov. 2016. Hidost: a static machine-learning-based detector of malicious files. EURASIP Journal on Information Security 2016, 1 (26 Sep 2016), 22. https://doi.org/10.1186/s13635-016-0045-0
[86] Didier Stevens. 2008. PDF Tools. (2008). http://blog.didierstevens.com/programs/pdf-tools
[87] Symantec. 2018. Internet Security Threat Report (Vol. 23).
[88] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2014. Intriguing properties of neural networks. In International Conference on Learning Representations. http://arxiv.org/abs/1312.6199
[89] S. Momina Tabish, M. Zubair Shafiq, and Muddassar Farooq. 2009. Malware detection using statistical analysis of byte-level file content. In Proc. of the ACM SIGKDD Work. on CyberSecurity and Intelligence Informatics.
[90] Trevor Tonn and Kiran Bandla. 2013. PhoneyPDF. (2013). https://github.com/kbandla/phoneypdf
[91] Malware Tracker. 2018. PDF Current Threats. (2018). https://www.malwaretracker.com/pdfthreat.php
[92] Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, and Thomas Ristenpart. 2016. Stealing Machine Learning Models via Prediction APIs. In 25th USENIX Security Symposium (USENIX Security 16). USENIX Association, Austin, TX, 601–618.
[93] Zacharias Tzermias, Giorgos Sykiotakis, Michalis Polychronakis, and Evangelos P. Markatos. 2011. Combining Static and Dynamic Analysis for the Detection of Malicious Documents. In Proceedings of the Fourth European Workshop on System Security (EUROSEC '11). ACM, New York, NY, USA, Article 4, 6 pages. https://doi.org/10.1145/1972551.1972555
[94] Cristina Vatamanu, Dragoş Gavriluţ, and Răzvan Benchea. 2012. A Practical Approach on Clustering Malicious PDF Documents. J. Comput. Virol. 8, 4 (Nov. 2012), 151–163. https://doi.org/10.1007/s11416-012-0166-z
[95] Nedim Šrndić and Pavel Laskov. 2013. Detection of Malicious PDF Files Based on Hierarchical Document Structure. In Proc. of the 20th Annual Network & Distributed System Security Symp.
[96] Nedim Šrndić and Pavel Laskov. 2014. Practical Evasion of a Learning-Based Classifier: A Case Study. In Proc. of the 2014 IEEE Symp. on Security and Privacy (SP '14). IEEE Computer Society, Washington, DC, USA, 197–211. https://doi.org/10.1109/SP.2014.20
[97] Nedim Šrndić and Pavel Laskov. 2014. Practical Evasion of a Learning-Based Classifier: A Case Study. In Proc. 2014 IEEE Symp. Security and Privacy (SP '14). IEEE CS, Washington, DC, USA, 197–211.
[98] VulDB. 2018. The Crowd-Based Vulnerability Database. (2018). https://vuldb.com
[99] Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia, Xinyu Xing, Xue Liu, and C. Lee Giles. 2017. Adversary Resistant Deep Neural Networks with an Application to Malware Detection. In KDD 2017 - Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 1145–1153. https://doi.org/10.1145/3097983.3098158
[100] Carsten Willems, Felix C. Freiling, and Thorsten Holz. 2012. Using Memory Management to Detect and Extract Illegitimate Code for Malware Analysis. In Proceedings of the 28th Annual Computer Security Applications Conference (ACSAC '12). ACM, New York, NY, USA, 179–188. https://doi.org/10.1145/2420950.2420979
[101] Meng Xu and Taesoo Kim. 2017. PlatPal: Detecting Malicious Documents with Platform Diversity. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 271–287. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/xu-meng
[102] Weilin Xu, Yanjun Qi, and David Evans. 2016. Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers. In NDSS. The Internet Society.
[103] Yanfang Ye, Tao Li, Donald Adjeroh, and S. Sitharama Iyengar. 2017. A Survey on Malware Detection Using Data Mining Techniques. ACM Comput. Surv. 50, 3, Article 41 (June 2017), 40 pages. https://doi.org/10.1145/3073559