PROJECT MAELSTROM: FORENSIC ANALYSIS OF THE BITTORRENT-POWERED BROWSER
Jason Farina, M-Tahar Kechadi, Mark Scanlon
School of Computer Science, University College Dublin, Ireland.
[email protected], {tahar.kechadi, mark.scanlon}@ucd.ie

ABSTRACT
In April 2015, BitTorrent Inc. released their distributed peer-to-peer powered browser, Project Maelstrom,
into public beta. The browser facilitates a new alternative website distribution paradigm to the traditional
HTTP-based, client-server model. This decentralised web is powered by each of the visitors accessing each
Maelstrom hosted website. Each user shares their copy of the website's source code and multimedia content
with new visitors. As a result, a Maelstrom hosted website cannot be taken offline by law enforcement or any
other parties. Due to this open distribution model, a number of interesting censorship, security and privacy
considerations are raised. This paper explores the application, its protocol, sharing Maelstrom content and its
new visitor powered “web-hosting” paradigm.

Keywords: Project Maelstrom, BitTorrent, Decentralised Web, Alternative Web, Browser Forensics

1. INTRODUCTION

Project Maelstrom was released as a private alpha in December 2014 [Klinker, 2014] and as a public beta in April 2015 [Klinker, 2015]. Its purpose is to provide a decentralised web ecosystem facilitating a new parallel to the existing world wide web. Through this decentralisation, users are free to create and share any content they desire without the need for web hosting providers or domain names, and can bypass any national or international censorship. The HTML documents, associated styling and scripting files, multimedia content, etc., are hosted by the website's visitors, and subsequently served to other visitors accessing the site. The official BitTorrent blog celebrated its arrival as:

    Truly an Internet powered by people, one that lowers barriers and denies gatekeepers their grip on our future [Klinker, 2014].

While the topic of BitTorrent in the media seems to predominantly coincide with a discussion on online piracy [Choi and Perez, 2007], the protocol has proven itself as a robust, low-cost, distributed alternative to the traditional client-server content distribution model. From a forensic standpoint, the decentralised nature of BitTorrent based applications can result in extended windows for evidence acquisition [Scanlon et al., 2014]. Aside from facilitating copyright infringement, a number of additional online services have been developed using the protocol including:

• File Synchronisation Service – BitTorrent Sync is a cloudless alternative to the cloud-based, multiple-device, file synchronisation services such as Dropbox, OneDrive, iCloud, etc. [Scanlon et al., 2015]. With a standard BitTorrent Sync install, users are not limited in the amount of data they can share as there are no replicated server-side limitations.
• Cost-Effective Commercial Content Distribution – BitTorrent Inc. use the protocol to distribute commercial multimedia content through their “BitTorrent Bundle” offering. Large video game creators and distribution companies, such as Blizzard Entertainment and Valve, have used the BitTorrent protocol to distribute installation files and software patches to their users [Watters et al., 2011]. The advantage for these companies is that the protocol excels where the traditional client-server model starts to fail; the more users, the faster the average download speed.

1.1 Contribution of this Work

The contribution of this work can be summarised as:

• An overview and analysis of Project Maelstrom’s functionality.
• An analysis of the forensic value of its installation and configuration files.
• An investigation of evidence left behind after uninstallation and what knowledge can be recovered.


2. BACKGROUND READING

2.1 Peer-to-Peer Facilitated Web Browsing

The deep web refers to layers of Internet services and communication that are not readily accessible to most users and are not crawlable by traditional search engines. Unlike the regular Internet, there is not one set of protocols or formats for the deep web; instead, deep web is a generic term used to describe Internet communications that are managed using closed or somehow restricted protocols. More recently however, the term “deep web” has been made synonymous with black-market sites such as “The Silk Road”, taken offline by the FBI in 2013. While BitTorrent Inc. do not claim that Project Maelstrom enhances privacy or anonymises a user’s Internet traffic, many of the competing decentralised deep web technologies focus their efforts on precisely that.

2.1.1 The Freenet Project

The Freenet project is a peer based distributed Internet alternative. Users connect to Freenet through an installed application that uses multiple encrypted connections to mask the identifying network information and the data location. Freenet peers, or nodes, store fragments of data in a distributed fashion. The number of times a data item is replicated is dependent on the demand for that data. More popular files have more available sources resulting in better availability and faster access times for the requester. The layered encryption of the connections provides an effective defence against network sniffing attacks and also complicates network forensic analysis. Freenet, by default, is deployed as an OpenNet service. On OpenNet, it is possible to enumerate and connect to all available nodes and perform traffic correlation to trace back to the original requester [Roos et al., 2014].

2.1.2 Tor

Tor (The Onion Router) is a networking protocol designed to provide anonymity while accessing both regular websites, and those hosted within the Tor network. Initially, Tor provided random internal encrypted routes enabling anonymised regular website access for its users. This was achieved through a minimum of three encrypted connections before exiting to the regular Internet through an exit node.

A user can opt not to utilise an exit node and instead browse content on services hosted within the Tor network itself. These Tor websites are known as Hidden Services (HS) and the identity and geolocation of the hosting server are obfuscated, in a similar fashion to the website’s visitors. This anonymisation would normally make any form of dependable routing impossible as any DNS lookup would not be able to report a static IP address for the service for the client system to connect to. Instead, hidden services use a distributed hash table (DHT) and a known service advertising node to inform clients of their presence and the path to the service provided.

Due to the reliance of the HS on DHT and directory servers to perform the introductions for Tor users, Biryukov et al. [2013] were able to demonstrate a denial of service attack on a HS by impersonating the directory servers. They were also able to crawl the DHT to harvest Onion Identifiers over a period of two days, resulting in an accurate index of the content of the deep web contained within Tor. This approach could be useful when attempting to stall a HS until it can be properly identified.

2.1.3 I2P

The Invisible Internet Project (I2P) is often considered another anonymisation utility like Tor. However, its garlic routing protocol is intended to create an alternative Internet. In this model, all traffic takes place via unidirectional tunnels established out to a resource which responds with its own, different, tunnelled return path. In this way all traffic to and from a host is protected by a separate session based tunnel that closely resembles that of Tor, but I2P was not intended to allow access to sites or services outside of the I2P network itself. In this regard, I2P more closely resembles Tor Hidden Services. In I2P, the traffic is encrypted end-to-end and the encryption in either direction is handled separately [Zantout and Haraty, 2011].

3. PROJECT MAELSTROM: APPLICATION ANALYSIS

Project Maelstrom is built upon the open source version of Google’s Chrome browser, Chromium, as can be seen in Figure 1.

3.1 Installation

The version of Project Maelstrom analysed as part of this paper has the following properties:

File: Maelstrom.exe
Size: 36,971 KB
MD5: d3b6560c997a37a1359721fdaa25925f
SHA1: 9de7c48e324b2ac9240ca168b7c2afd46bd1c799
Origin: download-lb.utorrent.com
Version: 37.0.2.1

All testing was carried out on a virtual machine running Windows 7 with 1 GB RAM and a thin provisioned 60 GB hard drive.


Figure 1: Project Maelstrom Default Start Page

The Project Maelstrom installation process is in two parts. The first part installs a browser based on Chrome that acts as a user interface for the Maelstrom network. The second is the BitTorrent Maelstrom application that handles caching of torrents and maintains connection to the distributed hash table (DHT).

Once executed, the Project Maelstrom installer extracts the required installation files to a TEMP directory in \Users\<username>\AppData. Chrome.7z and setup.exe are stored in a directory created at \Local\Maelstrom\Application\<version number>\Installer.

All pre-defined URLs used during installation are stored within settings files extracted from the Chrome.7z package and IP addresses are resolved as required through standard DNS queries. This allows the source servers to change addresses without invalidating older installation packages. The Project Maelstrom specific URLs at the time of writing are:

https://s3-us-west-1.amazonaws.com,
router.bittorrent.com,
update.browser.bittorrent.com,
router.utorrent.com,
bench.utorrent.com,
tracker.openbittorrent.com,
tracker.publicbt.com, and
https://s3-us-east-1-elb.amazonaws.com.

The first URL contacted is router.bittorrent.com which acts as a registration point for the client and a way to initiate DHT participation. update.browser.bittorrent.com runs a script on connection which checks the version of the client connecting and generates a redirect to download the relevant update if applicable. The update check is triggered when the local client sends a GET /windows/latest.json request and, if an update is required, the response contains a URL http://update.browser.bittorrent.com/windows/<latestversionnumber>/mini_installer.exe. This secondary installer silently deploys a new folder alongside the original version with the new version number and the preferences and path files are updated accordingly.

As part of the installation procedure Project Maelstrom installs the following executables to the \Application\ directory:

maelstrom.exe (desktop shortcut target)
MD5: 44d2641129cc922c5bc3545db8c8cda7
chrome.native.torrent.exe (browser torrent manager)
MD5: 85871a540cd77a3c419658fbe21c682b

These executables are copies of files stored in the <version> subfolder and any update will cause the current files to be renamed with a prefix of “old_” before being replaced with the later version. The file VisualElementsManifest.xml contains references to logos and other application images that include the <version> number to indicate which is the current active subfolder, though unless manually altered this will usually indicate the latest version installed.

The subfolder \Application\<version>\webui contains the base files for displaying default Chromium fonts, backgrounds and borders, as well as all the files necessary for the start page and the index.html and images for the Project Maelstrom “onboarding” homepage partly depicted in Figure 1.

In addition to the Application folder, AppData\Local\Maelstrom\User Data\ is provisioned to store user specific data including user preferences, browser and torrent settings and browsing history. The majority of files are stored in the Default folder which contains a subset of the standard Chrome installation files in addition to files specific to Project Maelstrom. These will be covered in section 3.3 with the other files of forensic value to an investigator.

The second part of the installation involves the deployment of the BitTorrent Maelstrom folder in \Users\<username>\AppData\Roaming\ to handle all torrent and DHT related activities. Any new torrents requested are saved here along with the corresponding .resume file. Torrent content is stored in the cache folder to speed up future access and to act as a repository to be shared with other users. On installation the root folder is used to store the files
8E65684D700ECC41A09A60EE58991845EA56F734.resume
8E65684D700ECC41A09A60EE58991845EA56F734.torrent
which are the torrent files associated with the Project


Maelstrom Startpage.

3.2 Settings

Being built on Chromium, Project Maelstrom has all of the same browsing options available to the user. A standard installation has the same browser defaults set as a standard Chrome installation. One setting of interest is found in Settings > Advanced Settings > System. The option to “continue running background apps when Maelstrom is closed” is enabled by default. This will result in the chrome.native.torrent.exe executable running and sharing cached website torrent data from the user’s system without any visible indicator. In testing, the data rate was observed to hold at an average of 800 bytes per second up and 800 bytes per second down. This was most likely just DHT updates and ping traffic. At any given time there were approximately 50 concurrent connections recorded.

The Project Maelstrom specific settings can be found in the “Torrent” section of the settings page. Each setting is customisable but at the time of installation the defaults are:

• Cache Size – The amount of space available to store cached torrent data is set to 5GB with a warning not to exceed 100GB.
• Sharing Ratio – A slider can be used to change the ratio from 0 upload to unlimited upload. The default setting is for uploaded data to equal the amount downloaded. This sharing ratio is measured on a per torrent basis and is presented to the user as a measure of contribution.
• Rate Limits – The default setting is to not restrict upload or download speeds.
• Transfer Limit – The default setting is not to impose a cap on total transfer amounts.
• Port Settings – A port is selected at installation as the static communication port. This port number can be customised here or can be set to be randomly selected.
• Proxy Server – The proxy settings for torrent transfers can be selected from a dropdown list including SOCKS which will allow for the use of Tor for added anonymity. By default, no proxy is selected.
• Privacy – By default the option to send anonymous crash data to BitTorrent Inc. is enabled.

3.3 Files of Forensic Value

During the first run the settings and preferences files stored in User Data and Application are populated. At the same time, the torrent specific settings and discovery are recorded in the Roaming directory. Because of Project Maelstrom’s roots in the Chromium browser, many of the same forensically important resources are present. However, the dual nature of the utility does present some degree of difference. For example, while the History tab shows the bittorrent:// and magnet:// sites visited, the content for these sites is not stored in the standard caches or data files. Checking the output of about:cache in the browser will show only standard website files. All caching and management relating to torrents are stored in the Roaming\BitTorrent Maelstrom folder.

• Local\Maelstrom\Application – This directory contains initial setup files and content added during installation. Of most use is confirmation of an active installation, timestamps to indicate install time and a record of the initial settings.
• Local\Maelstrom\User Data – Contains the majority of the user specific settings and activity records. This folder and its subfolders can be examined using established forensic techniques for use on Chrome browsers but also includes some new files that may be of use.
  1. The history SQLite3 database file stored in the default folder will contain both the magnet:? URI and the resolved bittorrent:// address. The magnet:? entry will have a title (same as the torrent file), while the bittorrent:// address will have the title of the index.html page contained in the bundle.
  2. Origin Bound Certs will only contain the certificates specific to the normal internet browsing activities.
  3. The HTML rendered after scripts have been run is stored in the Cache directory. For example, the full HTML of bittorrent://welcome is stored as part of f_000005.
  4. Local Storage – Contains persistent information used by visited websites. From the first run this will contain the Unique ID used by bench.utorrent.com to gather anonymous usage statistics. The contents of this folder are SQLite3 databases with individual data sets stored as blobs.
  5. Session Storage – Contains a list of BitTorrent UUIDs. The first UUID matches that stored in the bench.utorrent local storage file. The remainder are those of remote peers the browser is in contact with.
• Roaming\BitTorrent Maelstrom\ – Contains the bulk of the torrent handling and storage elements of Project Maelstrom. The root folder has many items of forensic value including:


  1. <bt.infohash>.torrent – The torrent file used to locate and download resources from peers. This is a standard torrent file format listing all sub-files contained within the “bundle” and the hashes for each piece and the trackers.
  2. <bt.infohash>.resume – A resume file created to allow .torrent data processing to be paused and continued without having to restart to ensure full data transfer. The creation of the resume file could be used as a guideline to when the related .torrent was first processed.
  3. dht.dat – This consists of a bencoded list of observed IP:Port combinations of peers participating in the DHT. The content starts with the value id20:, which is followed by the BitTorrent client’s unique ID on the DHT network. It is then followed by the number of observed nodes or peers and a listing of the IP:Port pairs in 6 byte IPv4 representations.
  4. settings.dat – This file contains the torrent client settings and the usage statistics.
  5. Cache – This stores the downloaded torrent files, which correspond to the name of the torrent appended with the btinfohash of the piece. Scripts and web pages are loaded and executed from this directory to improve browsing speed for the user and to provide a repository to share to other peers.
  6. trusted folder – This stores the bittorrent.crt file for use in peer secure transfer negotiation.

3.4 Local File Remnants

While there is no dedicated uninstall executable for Project Maelstrom, the registry key
HKCU\<user>\Software\Microsoft\Windows\CurrentVersion\Uninstall\Maelstrom
UninstallString
shows the control panel remove program options to be the equivalent of running
“AppData\Local\Maelstrom\Application\<version>\Installer\setup.exe” --uninstall

By default the option to remove user history is not checked. If left as default, all of the files and folders in AppData\Local\Maelstrom\User Data\ are left intact including any history and cookie folders as well as user specific settings and preferences.

If the option to remove user history is selected then the entire AppData\Local\Maelstrom\ folder and all subfolders are removed. However, in either scenario the AppData\Roaming\BitTorrent Maelstrom\ folder is left untouched along with all of the subfolders and files contained therein including dht.dat and any .torrent and .resume files that may have been stored. Additionally all cached data held in the cache directory is left intact and retrievable.

4. PROJECT MAELSTROM: PROTOCOL ANALYSIS

4.1 Accessing a Website

Project Maelstrom supports all of the regular protocols that ship with Chromium, alongside two BitTorrent ecosystem specific URIs: bittorrent:// and magnet:?.

In order to load any given webpage, the browser must:

1. Resolve the magnet link to a .torrent file.
2. Identify other peers currently serving that webpage.
3. Identify any torrents contained within the webpage bundle.
4. Cache part files and store them as blocks of the whole in sequence.
5. Process any scripts that determine content.
6. Update the sharing status of the local content to include the new webpage.

4.1.1 Peer Discovery

In order to discover an initial set of peers the client contacts router.bittorrent.com. The packet requesting peers contains the command “find_nodes” and is bencoded with the Local Peer ID followed by the BitTorrent share ID. In response a bencoded list of registered peers is returned, each with a 26 byte entry consisting of a 20 byte PeerID and a 6 byte IP:Port entry. The local client then uses DHT_Ping to discover live hosts and standard torrent discovery is performed as described in the BitTorrent Extension Protocols (BEP) – specifically BEP005 deals with DHT discovery and bootstrapping [Loewenstern and Norberg, 2008]. Any peer that has the torrent in its cache and has the chrome.native.torrent.exe executable running, even if the Maelstrom browser is shut down, becomes available as a valid content source.

4.1.2 Content Negotiation

Links in Project Maelstrom are handled either as torrent files or as magnet:? links. If a magnet link is followed the Maelstrom browser must first resolve the data provided in the link to a valid torrent file.


This is achieved either by looking up the tracker URL included in the magnet link using the tr parameter or one of the default trackers defined at installation. Once discovered the torrent ID is then used to construct find_nodes requests that are sent to the known DHT peers and router.bittorrent.com.

4.2 Maelstrom Development

The development of websites for distribution through Maelstrom does not require any special consideration during the development of the website content, i.e., the HTML, CSS, JavaScript, graphics, multimedia content, etc. Where the process differs from traditional web development is in the uploading of content to a web server. These websites can be delivered either as a single large torrent resource broken into pieces or, for ease of maintenance and to avoid users having to re-download the entire site every time the content is changed, as a series of .torrent files linked by an encompassing “Bundle”.

To make the process more streamlined, BitTorrent Inc. have developed a number of helper tools and guidelines [BitTorrent Inc., 2015].

5. CONCLUSION

It is envisioned that the popularity of Project Maelstrom (and similar decentralised alternatives) will increase in the coming months and years due to its ease of use, easy web-authoring and zero-cost model (to content creators and consumers alike). Perhaps BitTorrent Inc. will have the ability to raise their new Internet from the deep web into the hands of regular Internet users.

5.1 Future Work

With the browser just having been recently released as a public beta, there remains much to be discovered regarding the nuances of this decentralised web. Some areas for future work include:

• Development of a tool capable of monitoring who is accessing any specific Maelstrom-only website. The decentralised nature of the protocol leaves it vulnerable to third parties garnering visitor information statistics that are normally reserved for the administrator of the web host such as location, access duration, files downloaded, frequency of repeat visits, content available for sharing, etc.
• Perform an analysis of ZeroNet [ZeroNet, 2015], an open source Project Maelstrom alternative. ZeroNet does not require the use of a specific browser – instead running as a service on the local machine.

REFERENCES

Alex Biryukov, Ivan Pustogarov, and Ralf-Philipp Weinmann. Trawling for tor hidden services: Detection, measurement, deanonymization. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, pages 80–94, 2013.

BitTorrent Inc. Torrent Web Tools. https://github.com/bittorrent/torrent-web-tools, May 2015.

David Y Choi and Arturo Perez. Online piracy, innovation, and legitimate business models. Technovation, 27(4):168–178, 2007.

Eric Klinker. Project Maelstrom: The Internet We Build Next. http://blog.bittorrent.com/2014/12/10/project-maelstrom-the-internet-we-build-next/, December 2014.

Eric Klinker. Project Maelstrom Enters Beta. http://blog.bittorrent.com/2015/04/10/project-maelstrom-enters-beta/, April 2015.

Andrew Loewenstern and Arvid Norberg. DHT Protocol. http://www.bittorrent.org/beps/bep_0005.html, 2008. [Online; accessed July 2015].

Stefanie Roos, Benjamin Schiller, Stefan Hacker, and Thorsten Strufe. Measuring freenet in the wild: Censorship-resilience under observation. In Privacy Enhancing Technologies, pages 263–282. Springer, 2014.

Mark Scanlon, Jason Farina, Nhien-An Le Khac, and M-Tahar Kechadi. Leveraging Decentralisation to Extend the Digital Evidence Acquisition Window: Case Study on BitTorrent Sync. Journal of Digital Forensics, Security and Law, pages 85–99, 2014.

Mark Scanlon, Jason Farina, and M-Tahar Kechadi. Network Investigation Methodology for BitTorrent Sync: A Peer-to-Peer Based File Synchronisation Service. Computers & Security, 2015. http://dx.doi.org/10.1016/j.cose.2015.05.003.

Paul A. Watters, Robert Layton, and Richard Dazeley. How much material on BitTorrent is infringing content? A case study. Information Security Technical Report, 16(2):79–87, 2011.

Bassam Zantout and Ramzi Haraty. I2P data communication system. In ICN 2011, The Tenth International Conference on Networks, pages 401–409, 2011.

ZeroNet. https://github.com/HelloZeroNet/ZeroNet, May 2015.
