Towards A Generic Approach For Memory Forensics


Ethar Qawasmeh∗ , Mohammed I. Al-Saleh∗† , and Ziad A. Al-Sharif∗


∗ Jordan University of Science and Technology
† Higher Colleges of Technology, Computer
{misaleh,zasharif}@just.edu.jo, [email protected]

Abstract—The era of information technology has, unfortunately, contributed to a tremendous rise in the number of criminal activities. However, digital artifacts can be utilized in convicting cybercriminals and exposing their activities. Digital forensics science is concerned with all aspects related to cybercrimes. It seeks digital evidence by following standard methodologies so that the evidence can be admitted in courtrooms. This paper is concerned with memory forensics because of the unique artifacts memory holds. Memory contains information about the current state of systems and applications. Moreover, an application's data explains how a criminal has been interacting with the application just before the memory is acquired. Memory forensics at the application level is currently random and cumbersome. Targeting specific applications is what forensic researchers and practitioners are currently striving to provide. This paper suggests a general solution to investigate any application. Our solution aims to utilize information about an application's data structures and variables in the investigation process. This is because an application's data has to be stored and retrieved by means of variables. Data structure and variable information can be generated by compilers for debugging purposes. We show that an application's information is a valuable resource to the investigator.

Index Terms—Memory forensics, PDB file, Digital Evidence, Debugging information, Application's data

I. INTRODUCTION

Recent technologies of computation and communication in digital devices have led to a significant increase in the number of cybercrimes. Intruding into others' machines to steal their valuable information, executing malicious programs, spying on users' activities, or causing damage to systems are examples of cybercrimes. Digital Forensics (DF) is a discipline that helps investigators extract digital evidence from digital devices [1]. Various digital storage sources can be inspected to find digital evidence, such as Hard Drives (HD), Solid State Drives (SSD), Random Access Memory (RAM), networks, phone SIM cards, and Internet-of-Things devices. As with identifying evidence of physical crime, digital forensics can be utilized to attribute evidence and hold the suspect accountable, confirm alibis or statements, evaluate sources (copyright materials or document authentication), or determine intent (Casey [2]).

Memory Forensics (MF) is one of the most effective digital forensic disciplines. It aims to extract digital evidence from volatile data that exists in RAM. Furthermore, MF plays an important role in incident response, malware analysis, and reverse engineering, where it can be utilized to inspect suspected systems and their memories [3]. A wealth of information can be extracted from memory, such as files, processes, registry keys, passwords, encryption keys, and network data. Therefore, the result of applying MF is invaluable in cases that do not easily leave artifacts on the hard drive, where the only place a piece of information can be found is memory. For example, a running process might generate data that will never be stored on secondary storage. In cybercrimes, many different attacks might involve memory-only information [4, 5, 6]. Viewing web pages that contain entities such as pictures, audio, videos, and ads might involve files that are not stored on the HD at all but stay in memory. This requires investigation at the memory level.

Application-level evidence gives an indication of how the user is using an application at the time the memory image is acquired. Due to their importance, certain applications attract digital investigators. For example, Web browsers are among the main applications whose associated information, such as visited URLs and search queries, investigators are greatly interested in inspecting (Said et al. [7]). In addition, the most commonly used Windows applications (Microsoft Word 2007, Microsoft Excel 2007, Adobe Reader 9.0, Microsoft PowerPoint 2007, and Internet Explorer 7.0) were analyzed to inspect the user information fragments and the viewed web pages that could be extracted from various areas in memory (Olajide et al. [8]). Nevertheless, different artifacts can be extracted from the memory of a running program. For example, the program's state and execution behavior can be explored by utilizing the source code and its object-oriented programming structure [9, 10]. The values of a program's variables vary in their scopes and durations, which indicates the program's states.

This paper is organized as follows. Related work is covered in Section II. Our investigation model is presented in Section III. This is followed by Section IV, which explains our experimental setup. Our results are shown in Section V. A discussion and future work are covered in Section VI, followed by the conclusion.

II. RELATED WORK

The main objective of memory forensics is to analyze volatile data, extract digital artifacts, and identify malicious code from the relevant suspicious programs. Several significant studies have aimed to improve the analysis of acquired memory, especially after the memory challenge was launched at the Digital Forensics Research Workshop (DFRWS, 2005) [11]. Dolan-Gavitt [12], in 2007, proposed the use of the Virtual Address Descriptor (VAD) tree structure in Windows to help in the analysis of memory dumps. A description of how to locate and parse the VAD structure, which can be obtained by walking the page directory for the process, is also given.

978-1-7281-5061-1/19/$31.00 ©2019 IEEE 094
In 2011, Okolica and Peterson [13] proposed a technique for extracting contents from memory. They presented methods to find data in user mode and kernel mode by leveraging the PDB files from Microsoft's symbol server to resolve user32.dll and win32k.sys, respectively. Their results showed that the proposed algorithm was able to retrieve copy/paste information from memory dumps of various Windows versions (XP, Vista, and Windows 7, both 32-bit and 64-bit) using data from several applications, including Notepad, Microsoft Word, and Microsoft Excel. In 2014, Cohen and Metz [14] implemented a plugin named mspdb parser for the Rekall memory analysis framework [15] that parses PDB files and calculates kernel symbol addresses.
Al-Saleh and Al-Sharif [16], in 2012, presented an empirical study that might help DF investigators in discovering cybercriminals by utilizing data left in TCP buffers. Their results showed that, as long as the TCP connection is still alive, the data sent or received by the cybercriminal's machine can be found on both Windows and Linux. Their work shed light on utilizing memory forensics to inspect such artifacts associated with online systems that use TCP connections in order to serve DF investigations.
Furthermore, in 2013, Ohana and Shashidhar [17] explored memory artifacts from private and portable web browsing sessions. The artifacts must contain enough file fragments to establish a correct link between user and session. They ran their experiments against five major web browsers, including Internet Explorer, Google Chrome, Mozilla Firefox, and Apple Safari.

In 2014, Al-Khaleel et al. [18] examined memory artifacts of the Tor bundle. They performed different experiments to check what artifacts the Tor browser might leave in memory. The results showed that Tor deleted all valuable in-memory information. Therefore, Tor users can surf the Internet and enjoy Tor privacy.

Cohen [19], in 2015, characterized the variability of different Windows kernel versions and its impact on memory analysis. This was done by collecting a large number of Windows kernel binaries (ntkrnlmp.exe, ntkrpamp.exe, and ntoskrnl.exe) and win32k.sys GUI subsystem binaries and then downloading the corresponding PDB files from the Microsoft symbol server. In particular, the findings showed that the struct layout was stable across major kernel versions, whereas the kernel global offsets varied greatly with version. Memory analysis has also been successfully utilized to detect malware through signature scanning. Specifically, Yara has become a de facto standard for malware signatures for files. In 2017, Cohen [20] presented the difference between applying Yara signatures to files and applying them at the memory level, and developed a Yara scanning engine that can scan all processes simultaneously in an efficient manner.

Various pieces of digital evidence can be extracted from memory in relation to MS Word documents. Al-Sharif et al. [21], in 2018, proposed a memory forensic approach that utilizes the XML representation used internally by MS Office. Their results showed that different portions of an MS Word document can be extracted from memory to prove that the document was viewed or edited by the user. Recently, in 2019, Al-Saleh et al. [22] studied network reconnaissance detection using memory forensic approaches. They utilized the information that can be recovered from memory. Furthermore, they observed that packets sent or received through the network can stay in memory for a while. Their results showed that the investigation of memory was effective for detecting the artifacts of an attack.

III. INVESTIGATION MODEL

In this paper, we explain the procedure that we follow to realize the investigation model shown in Figure 1. In the first phase of our model, preparation, an application is developed and compiled in order to get the executable file and to acquire the debugging information of the application we want to investigate. In Visual Studio, the debugging information can be produced by choosing Debug mode during the compilation process. It is stored in Program Database (PDB) files (also called symbol files). In the imaging phase, a memory dump of the target computer has to be taken. The memory dump is acquired while the target application is running. We implemented our model as a plugin for the Volatility open source memory analysis framework [23]. In the parsing phase, the PDB file is parsed in order to extract all variable names (global and local) along with their relative virtual addresses (RVAs) and all primitive and complex data types. We have used the official Microsoft-pdb parser [24] since it produces debugging information along with many options to display the parsed information in an organized manner. In the inspecting phase, our plugin utilizes the Volatility capabilities to dump the virtual address space of the target application. The next step in the inspecting phase is mapping the symbols parsed from the PDB file onto the application's memory. The resulting variable values are then reported in the reporting phase.

Fig. 1: Investigation Model

IV. EXPERIMENTAL SETUP

We designed our experiments to validate our proposed investigation model. The experiment is for a program we wrote ourselves. It contains all the structures we want to check our framework against. We use an Oracle VirtualBox virtual machine (VM) [25] for experimentation and memory acquisition. VirtualBox has built-in features that enable us to capture bit-by-bit copies of the memory of a virtual machine while it is running. Figure 2 illustrates the specifications of both the VM and the host machine.

Fig. 2: Experimental setup

To check the validity of our investigation model, we designed a Test Application (TA), a C++ program written in Visual Studio that contains various important structures. We proceed with the following procedure:
• The program is compiled in Debug mode.
• The program's PDB file is extracted for further investigation.
• The program is executed and paused at some point of execution.
• The RAM is captured into a dump file while the program is running.
• Our investigation model is executed with the RAM dump and the PDB file as inputs.

V. RESULTS

In this section, we present the results of the experiments discussed in Section IV. The PDB file contains a number of useful pieces of information that can be inspected by our investigation model:
• Global variables. In each application there are some global variables that can be accessed by all running threads. Global variables reside in the Portable Executable (PE) file, particularly in the (.data) section in memory. Furthermore, global variables can be parsed from the global stream in the PDB file.
• Struct and struct members. This structure holds information about memory offsets for struct members, their names, and their data types. This is useful in order to interpret the contents of memory properly.
• Function addresses and local variables. The locations of functions, their addresses, sizes, arguments, and local variables are also provided in the PDB file.
• Enumeration. This structure is a way to represent one of a set of choices using an integer. The mapping between the integer value and the given string can be found in the PDB file.

Figure 3 shows an example of declaring primitive data type variables, such as integer and double. In the parsing results, the global variable information is extracted from the global stream in the PDB file, as shown in step A. The relative virtual address of the variable is placed between square brackets. The first term indicates the number of the section header in the PE file (.data) where the global variables are stored. By parsing the section header stream from the PDB file, we can obtain the virtual address of the target section, as shown in step B of parsing. The second term in square brackets is the variable's offset from the address of the (.data) section. The target address of a variable is then obtained by adding the variable offset to the VA of the (.data) section and the base address of the application.

Fig. 3: Parsing results of primitive data types

Figure 4 shows a global struct variable definition as an example of a complex data structure. We defined a struct with four members (a character array, an integer, a double, and a float). The data members of the struct variable are filled with values as an example of a non-primitive data type. As in the previous example, the global variable information is parsed from the global stream in the PDB file in parsing step A, and the PE file information is parsed in step B. While primitive data types have a specific identifier (int, double, etc.) and occupy a fixed-size memory location, non-primitive data types are complex data structures.
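The address arithmetic described for the global variables of Figure 3 can be illustrated with a minimal Python sketch. It is not the code of our Volatility plugin: the function names, the base address, the section VA, the variable offsets, and the synthetic memory buffer are all hypothetical values chosen only for illustration, and little-endian encoding is assumed.

```python
import struct


def target_address(image_base, section_va, variable_offset):
    """Base address of the application + VA of the section + variable offset."""
    return image_base + section_va + variable_offset


def read_value(dump, dump_base, address, fmt):
    """Decode one primitive value located at `address` in a dumped address space.

    `dump` holds the bytes of the address space starting at `dump_base`;
    `fmt` is a struct format such as "<i" (int) or "<d" (double).
    """
    return struct.unpack_from(fmt, dump, address - dump_base)[0]


# Hypothetical values standing in for what the PDB/PE parsing steps would yield.
IMAGE_BASE = 0x00400000        # base address of the running application
DATA_SECTION_VA = 0x0001A000   # VA of the (.data) section (step B)
COUNTER_OFFSET = 0x0138        # offset of an int variable (step A)
PRICE_OFFSET = 0x0140          # offset of a double variable (step A)

# Synthetic stand-in for the dumped address space of the target application.
dump = bytearray(0x20000)
struct.pack_into("<i", dump, DATA_SECTION_VA + COUNTER_OFFSET, 42)
struct.pack_into("<d", dump, DATA_SECTION_VA + PRICE_OFFSET, 3.5)

counter_addr = target_address(IMAGE_BASE, DATA_SECTION_VA, COUNTER_OFFSET)
price_addr = target_address(IMAGE_BASE, DATA_SECTION_VA, PRICE_OFFSET)

print(hex(counter_addr), read_value(dump, IMAGE_BASE, counter_addr, "<i"))
print(hex(price_addr), read_value(dump, IMAGE_BASE, price_addr, "<d"))
```

In this sketch the size and encoding of each value come from the format string, which plays the role of the primitive type identifier recovered from the PDB file.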

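For a struct variable such as the one in Figure 4, the member offsets recorded in the PDB file make it possible to interpret the raw bytes member by member. The following Python sketch illustrates the idea under stated assumptions: the member names, the offsets (0, 16, 24, 32), the addresses, and the synthetic memory buffer are hypothetical and are not taken from our parsing results.

```python
import struct

# Hypothetical field list: (member name, offset within the struct, format).
# In the investigation model these entries would come from the PDB file.
FIELD_LIST = [
    ("name", 0, "<16s"),   # array of characters
    ("count", 16, "<i"),   # integer
    ("ratio", 24, "<d"),   # double
    ("scale", 32, "<f"),   # float
]


def read_struct(memory, memory_base, variable_address, field_list):
    """Decode each member found at variable_address + member offset."""
    values = {}
    for name, member_offset, fmt in field_list:
        start = variable_address + member_offset - memory_base
        (value,) = struct.unpack_from(fmt, memory, start)
        if isinstance(value, bytes):
            # Treat character arrays as NUL-terminated ASCII strings.
            value = value.split(b"\x00", 1)[0].decode("ascii")
        values[name] = value
    return values


# Synthetic stand-in for the dumped address space (all values hypothetical).
MEMORY_BASE = 0x00400000
VARIABLE_ADDRESS = 0x0041A200
memory = bytearray(0x20000)
start = VARIABLE_ADDRESS - MEMORY_BASE
struct.pack_into("<16s", memory, start + 0, b"alice")
struct.pack_into("<i", memory, start + 16, 7)
struct.pack_into("<d", memory, start + 24, 0.25)
struct.pack_into("<f", memory, start + 32, 1.5)

record = read_struct(memory, MEMORY_BASE, VARIABLE_ADDRESS, FIELD_LIST)
print(record)
```

Keeping the field list as data rather than hard-coding the layout mirrors the goal of a generic approach: the same routine can decode any struct once its member information has been parsed.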


We cannot know the size of a non-primitive variable's memory location before parsing its information. Therefore, in step A of the parsing phase, the information parsed from the PDB file for the non-primitive variable shows that the type of the variable is given by a reference address. Thus, we have to analyze the reference address (0x114e) from the PDB file to get the full information of the variable. The parsing results show that the reference address (0x114e) identifies this variable as a struct data structure (LF_STRUCTURE). To reach the struct members, we have to analyze the reference address (0x114d), which is given by the field list type term. By analyzing the field list type reference address (0x114d), we obtain full information about the struct members that the variable has. It is important to note that the parsed information of the struct members includes the offset of each defined member. Furthermore, the target address of each struct member is obtained by adding the struct member offset to the variable offset, the VA of the .data section, and the base address of the running application.

Fig. 4: Parsing results of a struct variable

VI. DISCUSSION AND FUTURE WORK

In this paper, we examined the use of an application's debugging information in a memory forensic approach. The most common limitation of any memory forensic research is the volatile nature of memory; data vanishes when a device is turned off. However, this does not stop investigators from leveraging the invaluable information that resides in memory in case a device is found to be running at acquisition time. In our approach, the debugging information of applications must be provided. Some application developers might not cooperate to provide such information. In these cases, current forensic techniques should be utilized. Furthermore, as a running example, we investigated C++ applications. Experimenting with the more complex data structures that might be developed by any application is our future work. In addition, testing our investigation model with a real-world application is a future direction.

VII. CONCLUSION

As cybercrimes notably increase, forensic tools have to adapt to keep up with this challenge. Current forensic techniques investigate applications mostly individually. When applications are newly developed or updated, new forensic techniques are required to handle the situation. This paper proposed a novel technique that overcomes the diversity of applications with a unified solution. We suggest utilizing the information about an application's data structures and variables in the investigation process. Our results show that our approach is promising.

REFERENCES
[1] M. Reith, C. Carr, and G. Gunsch, “An examination of digital forensic models,” International Journal of Digital Evidence, vol. 1, no. 3, pp. 1–12, 2002.
[2] E. Casey, Handbook of digital forensics and investigation. Academic Press, 2009.
[3] A. Schuster, “Searching for processes and threads in microsoft windows memory dumps,” Digital Investigation, vol. 3, pp. 10–16, 2006.
[4] M. Al-Saleh and Z. Al-Sharif, “Ram forensics against cyber crimes involving files,” in The Second International Conference on Cyber Security, Cyber Peacefare and Digital Forensic (CyberSec2013), 2013, pp. 189–197.
[5] Z. A. Al-Sharif, M. I. Al-Saleh, Y. Jararweh, L. Alawneh, and A. S. Shatnawi, “The effects of platforms and languages on the memory footprint of the executable program: A memory forensic approach,” Journal of Universal Computer Science, vol. 25, no. 9, pp. 1174–1198, Sep. 2019.
[6] Z. Al-Sharif, D. Odeh, and M. Al-Saleh, “Towards carving pdf files in the main memory,” in The International Technology Management Conference (ITMC2015), 2015, pp. 24–31.
[7] H. Said, N. Al Mutawa, I. Al Awadhi, and M. Guimaraes, “Forensic analysis of private browsing artifacts,” in 2011 International Conference on Innovations in Information Technology. IEEE, 2011, pp. 197–202.
[8] F. Olajide, N. Savage et al., “Application level evidence from volatile memory,” Journal of Computing in Systems and Engineering, vol. 10, pp. 171–175, 2009.
[9] Z. A. Al-Sharif, M. I. Al-Saleh, and L. Alawneh, “Towards the memory forensics of oop execution behavior,” in 2017 8th International Conference on Information, Intelligence, Systems & Applications (IISA). IEEE, 2017, pp. 1–6.
[10] Z. A. Al-Sharif, M. I. Al-Saleh, L. M. Alawneh, Y. I. Jararweh, and B. Gupta, “Live forensics of software attacks on cyber–physical systems,” Future Generation Computer Systems, 2018.
[11] “Digital forensic research workshop. dfrws memory analysis challenge, 2005,” http://old.dfrws.org/2005/index.shtml, accessed: 2019-09-23.
[12] B. Dolan-Gavitt, “The vad tree: A process-eye view of physical memory,” Digital Investigation, vol. 4, pp. 62–64, 2007.



[13] J. Okolica and G. L. Peterson, “Extracting the windows clipboard from physical memory,” Digital Investigation, vol. 8, pp. S118–S124, 2011.
[14] M. Cohen and J. Metz, “Ms pdb parser,” https://github.com/google/rekall/commit/89f4f2832d99eac3b783b02ce9025806eaca6bd8, January 2014.
[15] G. Inc, “Rekall memory forensic framework,” http://www.rekall-forensic.com/, 2017.
[16] M. I. Al-Saleh and Z. A. Al-Sharif, “Utilizing data lifetime of tcp buffers in digital forensics: Empirical study,” Digital Investigation, vol. 9, no. 2, pp. 119–124, 2012.
[17] D. J. Ohana and N. Shashidhar, “Do private and portable web browsers leave incriminating evidence?: a forensic analysis of residual artifacts from private and portable web browsing sessions,” EURASIP Journal on Information Security, vol. 2013, no. 1, p. 6, 2013.
[18] A. Al-Khaleel, D. Bani-Salameh, and M. I. Al-Saleh, “On the memory artifacts of the tor browser bundle,” in The International Conference on Computing Technology and Information Management (ICCTIM). Society of Digital Information and Wireless Communication, 2014, p. 41.
[19] M. I. Cohen, “Characterization of the windows kernel version variability for accurate memory analysis,” Digital Investigation, vol. 12, pp. S38–S49, 2015.
[20] M. Cohen, “Scanning memory with yara,” Digital Investigation, vol. 20, pp. 34–43, 2017.
[21] Z. A. Al-Sharif, H. Bagci, A. Asad et al., “Towards the memory forensics of ms word documents,” in Information Technology-New Generations. Springer, 2018, pp. 179–185.
[22] M. I. Al-Saleh, Z. A. Al-Sharif, and L. Alawneh, “Network reconnaissance investigation: A memory forensics approach,” in 2019 10th International Conference on Information and Communication Systems (ICICS). IEEE, 2019, pp. 36–40.
[23] A. Walters, “The volatility framework: Volatile memory artifact extraction utility framework,” 2007.
[24] Microsoft, “microsoft-pdb,” https://github.com/Microsoft/microsoft-pdb.
[25] Oracle, “Virtualbox,” https://www.virtualbox.org/.

