Parasitic Computing


1. INTRODUCTION

Parasitic computing is a programming technique in which a program, through normal authorized interactions with another program, manages to get that program to perform complex computations on its behalf. It is, in a sense, a security exploit, in that the program implementing the parasitic computing has no authority to consume the resources made available to the other program.

The example given by the original paper involves two computers communicating over the Internet under the disguise of a standard communications session. The first computer is attempting to solve a large and extremely difficult 3-SAT problem; it decomposes the original 3-SAT problem into a considerable number of smaller problems. Each smaller problem is then encoded as a relation between a packet and its checksum, such that whether the checksum is valid is also the answer to that smaller problem. The packet and checksum are then sent to another computer. That computer, as part of receiving the packet and deciding whether it is valid and well-formed, computes a checksum of the packet and checks whether it matches the provided checksum. If the checksum is invalid, it requests a new packet from the original computer. The original computer now knows the answer to that smaller problem from the second computer's response, and can transmit a fresh packet embodying a different sub-problem. Eventually all the sub-problems are answered and the final answer is easily calculated. In the end, the target computers are unaware that they have performed computation for the benefit of the other computer, or indeed done anything beyond conducting a normal TCP/IP session.

The proof-of-concept is obviously extremely inefficient, as the amount of computation necessary merely to send the packets in the first place easily exceeds the computation leached from the other program; the 3-SAT problem would be solved much more quickly if simply analyzed locally. In addition, in practice packets would occasionally have to be retransmitted when real checksum errors and network problems occur. However, parasitic computing at the level of checksums is a demonstration of the concept. The authors suggest that as one moves up the application stack, there might come a point where there is a net computational gain to the parasite; perhaps one could break interesting problems down into queries against complex cryptographic protocols using public keys. If there were a net gain, one could in theory use a number of control nodes to make many hosts on the Internet form a distributed computing network, completely unaware.

Parasitic computing has drawn a lot of attention in the Internet community since the first demonstration of its potential. A search for the term “parasitic computing” on Google returns 544 results, with coverage on several major portal websites such as CNN and BBC. Much attention was drawn to the concept because it is capable of making use of computational resources from all the computers connected to the Internet that comply with the Transmission Control Protocol (TCP). Since most, if not all, of the computers on the Internet need to use the standard TCP protocol to communicate, the resources available to a parasitic user appear unlimited, and every server could be exploited. This poses a serious problem for servers on the Internet, as a server could end up allocating all of its computational capability to parasitic users, hampering access by normal clients much as in a denial-of-service attack. As the access of normal users is vital to the existence of a commercial website, it is important, if not crucial, to design effective mechanisms in the communication protocol stack to identify the parasitic users among the many normal users. Moreover, the lack of detection tools makes ordinary Internet users stressed and uneasy.

Our implementation of parasitic computing is not efficient. If it were made efficient, it could offer practically unlimited computational power.

THE NET is a fertile place where new ideas/products surface quite often. We have
already come across many innovative ideas such as Peer-to-Peer file sharing, distributed
computing and the like. Parasitic computing, which harnesses the computing power of machines
that spread across the Net to accomplish complex computing tasks, is new in this category.

When we send a request to a web server for a web page, the communication programs in our machine split the request into information packets before sending it across the Net. When these packets reach the target machine, they go through certain communication layers before reaching the target program, the web server that serves the requested page. One such layer is TCP, the Transmission Control Protocol, which assembles the packets in proper order and makes sure that all the packets are in proper shape before handing them over to the web server. At this stage the TCP component performs some computation to ascertain the validity of the received information packets. As reported in the magazine Nature (nature.com/nature/fow/010830.html), a few scientists exploited this aspect of TCP, known as the checksum, to deploy the computing power of various servers (without any permission) to perform some computations, thereby converting the Net into a giant distributed computer in which the servers perform computation on behalf of a remote node.

The scientists split their computational problem into small jobs (joblets) that can be evaluated independently, sent each joblet to a different server spread across the Net, and got a mathematical problem solved this way. This is similar to the famous distributed computing model, which uses the idle computing time of machines spread across the globe. (A few months ago Net Speak featured this exciting computing model; refer to The Hindu dated August 10, 2000.) As the experimenters enlisted the service of these servers without the knowledge or permission of the server administrators, they coined the term 'parasitic computing' to describe this innovative computing model. If you want more details on this subject, access the link at: nd.edu/parasite. Though using the web this way to tap the computing resources spread across the Net is exciting to innovative minds, it raises certain ethical and legal issues, as we are encroaching on others' resources without even informing them.

In order to fully understand the concept of parasitic computing, it is important to understand how communication over the Internet via TCP/IP works. To demonstrate this, consider a scenario where a user is trying to visit a website. When the user gives the browser the website URL (uniform resource locator), the browser opens a Transmission Control Protocol (TCP) connection and connects to the web server. After establishing this connection, the browser issues a Hypertext Transfer Protocol (HTTP) request over the already opened TCP connection. This TCP message is then carried to the destination (the web server) via the Internet Protocol (IP). In the process of transmitting the message from source (user) to destination, IP might break the entire message into several pieces, commonly referred to as TCP packets. These packets are then transmitted to the destination IP address via different routes. Once the destination receives all the packets, a response is returned to the source via the same TCP channel. The original message is reassembled through consecutive steps involving TCP and IP and is interpreted as an HTTP request. After that, the web server sends a response (the webpage HTML) back to the user (Cisco). Thus, even such a simple communication over the Internet requires a significant amount of computation at all network stages, and only cooperation and trust between all involved parties can guarantee successful communication over the Internet.
In parasitic computing, this trust-based relationship between machines connected to the network is exploited to make other machines perform certain mathematical operations on certain data without authorization. Albert-László Barabási, Vincent Freeh, Hawoong Jeong and Jay Brockman used a parasitic computer to solve the well-known NP-complete satisfiability problem, by engaging various web servers physically located in North America, Europe, and Asia, each of which unknowingly participated in the experiment. Like the SETI@home project, parasitic computing decomposes a problem into several small problems which are mutually independent and can be solved separately by machines connected to the network. But unlike SETI@home, the participating machines perform these calculations without knowing about them.

2. DETAILS OF PARASITIC COMPUTING


Parasitic computing is a concept by which one can use the resources of machines connected to the Internet. This technology exploits open Internet protocols to use the resources of remote machines. As the name suggests, the machine that requires the services of others does not need to be authorized by the latter. Any machine connected to the Internet has to carry out a minimum of processing on any packet it receives, without any authorization. Parasitic computing exploits this fact in order to make use of the computing power of remote machines and web servers all around the globe. So one cannot really stop one's machines from being utilized in this manner.

Parasitic computing can be a very effective technique when it comes to solving NP-complete problems such as Circuit SAT, 3-SAT, etc. These problems are currently considered among the world's most complex and time-consuming problems. Such a problem generally has a small set of solutions within a much larger set of possible candidates, and each candidate can be checked independently.
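This asymmetry is easy to see in code. The sketch below uses a tiny, made-up 3-SAT instance (the clause data is purely illustrative): verifying one candidate is cheap, but finding all solutions requires sweeping an exponentially large candidate space.

```python
from itertools import product

# A toy 3-SAT instance as clauses of signed variable indices
# (hypothetical example data): positive k means x_k, negative k
# means NOT x_k, with variables numbered from 1.
clauses = [(1, -2, 3), (-1, 2, -3), (1, 2, 3)]

def satisfies(assignment, clauses):
    """Checking one candidate is fast (polynomial in the input):
    every clause must contain at least one true literal."""
    return all(
        any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
        for clause in clauses
    )

# ...but the candidate space grows as 2^n, hence the exhaustive sweep:
n = 3
solutions = [a for a in product([False, True], repeat=n)
             if satisfies(a, clauses)]
```

For n = 3 the sweep is instant; for n in the hundreds, the 2^n sweep is exactly the work a parasitic computer tries to farm out, one candidate per packet.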

2.1 How does it differ from cluster computing?


Parasitic computing differs from cluster computing. Cluster computing is an arrangement in which many computers willingly pool their resources to create cumulative power equivalent to that of a supercomputer, capable of solving complex computational problems. In contrast, parasitic computing does not require the willingness of any target machine to participate in the problem solving. This way one can make use of many distributed resources across the Internet that would otherwise remain idle. Moreover, if one divides the task in such a manner that no computer is overloaded, the performance of the remote machines will not deteriorate much, and our task will still be accomplished.

2.2 It is not cracking


Though parasitic computing works without authorization, it is entirely different from the concept of cracking. In cracking, data is sent to the remote computer with malicious intentions, in order to corrupt some resource of the remote machine. Here, by contrast, we are utilizing those resources for a constructive purpose, and accessing only those parts of remote machines which are made open on the Internet, without any malicious intent. It can, however, cause delays of service on the remote machines, since resources are being utilized without the knowledge of the owner.

3. REVIEW OF PARASITIC COMPUTING

Reliable communication on the Internet is guaranteed by a standard set of protocols, used
by all computers. Here we show that these protocols can be exploited to compute with the
communication infrastructure, transforming the Internet into a distributed computer in which
servers unwittingly perform computation on behalf of a remote node. In this model, which we
call `parasitic computing', one machine forces target computers to solve a piece of a complex
computational problem merely by engaging them in standard communication. Consequently, the
target computers are unaware that they have performed computation for the benefit of a
commanding node. As experimental evidence of the principle of parasitic computing, we harness the power of several web servers across the globe which, unknown to them, work together to solve an NP-complete problem. Sending a message through the Internet is a sophisticated process
regulated by layers of complex protocols.

When a user selects a URL (uniform resource locator), requesting a web page, the
browser opens a transmission control protocol (TCP) connection to a web server. It then issues a
hyper-text transmission protocol (HTTP) request over the TCP connection. The TCP message is
carried via the Internet Protocol (IP), which might break the message into several packets that
navigate independently through numerous routers between source and destination. When an
HTTP request reaches its target web server, a response is returned via the same TCP connection
to the user's browser. The original message is reconstructed through a series of consecutive steps,
involving IP and TCP; it is finally interpreted at the HTTP level, eliciting the appropriate
response (such as sending the requested web page). Thus, even a seemingly simple request for a
web page involves a significant amount of computation in the network and at the computers at
the end points. The success of the Internet rests on the cooperation of and trust between all
involved parties.

Parasitic computing involves the participation of network elements (computers) in a distributed computation without their explicit knowledge. Distributed computing (e.g. SETI@home) also involves computing across the network, but there the nodes participate willingly. Parasitic computing may also interfere with the normal working of the target computer; in the case of HTTP, parasitic computing can be regarded as a denial-of-service (DoS) attack.

Reliable transmission of data in computer networks is handled by the Transmission Control
Protocol (TCP). TCP provides a reliable bit pipe between the two end hosts. Reliability in TCP is
achieved by a combination of basic error detection (checksum), retransmission and
acknowledgments. Below the TCP layer is the Internet Protocol (IP) layer which takes care of
routing. The TCP layer of the receiving host acknowledges every correct packet; a packet that fails the checksum is simply dropped, which eventually triggers retransmission. Parasitic computing uses the
checksum and acknowledgment to verify if a randomly generated value is a solution of a
mathematical problem. The parasitic node computes the checksum in such a way that only a
valid solution results in a checksum match.

Any computer with application layer services like HTTP, FTP, SSH or SMTP can be used as a target for parasitic computing. The following steps are required to perform a parasitic computation:
• Send a handcrafted packet to the server.
• Wait for some time to receive an acknowledgment, otherwise time out.
• Either terminate the connection, or send one more modified packet with appropriate
checksum.

One would also observe that all the packets are of equal importance if all the solutions are equally likely. Otherwise it would be more appropriate to first test the vectors which have a higher probability of success, followed by the others.
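The three bulleted steps reduce to a probe-with-timeout loop. A minimal sketch in Python (the `send` callable here is a hypothetical stand-in for the raw-socket round trip, which in reality requires crafting TCP segments with privileged access):

```python
def probe_with_retries(send, candidate, retries=3):
    """Run the three steps for one candidate: send the handcrafted
    packet and wait for an acknowledgment. A timeout is ambiguous:
    it may mean the checksum failed (the answer is 'no') or the
    packet was genuinely lost, so the probe re-sends a few times
    before concluding 'no'. `send` returns True on an
    acknowledgment and None on a timeout."""
    for _ in range(retries):
        if send(candidate):
            return True   # acknowledged: candidate passed the checksum
    return False          # repeated timeouts: treat as a failed checksum

# Deterministic stand-in for the network round trip:
acks = {b"\x01": True}                 # only this candidate gets an ACK
network = lambda cand: acks.get(cand)  # None simulates a timeout
```

The retry count trades reliability against wasted traffic: real checksum errors and packet loss (mentioned earlier in the text) would otherwise be misread as "candidate invalid".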

The parasitic computing described above operates in the TCP layer of the TCP/IP stack. One can also use the application layer to perform parasitic computing; this can be thought of as covert computing. For example, one can use the concept of frames and embedded JavaScript to perform the computation. When a user loads a web page, the web page is divided into two frames, one frame being a single pixel high. This frame carries embedded JavaScript that uses the host processor for computation and sends the results back to the server. This form of parasitic computing requires the initial cooperation of the user, i.e. the user has to visit the web site. The IP layer also has a 16-bit checksum, which could likewise be used for computation. However, performing parasitic computing at the IP layer leads to the packet being dropped at the first router: the packet would travel no further, and the dropped packets would cause congestion at that router.

4. IMPLEMENTATION USING TCP

Sending a message over the Internet is a very sophisticated process, as the message is processed across many layers: from HTTP to TCP to the IP layer, through the data link layer, and finally to the physical layer; at the destination the message is reconstructed in the reverse order. To implement parasitic computing one could in theory exploit the processing at any of these layers, but below the TCP layer it is not very beneficial.

So far there has been only one implementation that has exploited this concept of parasitic computing. The idea is to use some feature of the protocol in such a manner that remote machines respond to requests without knowing that they are involved in solving a complex problem, believing instead that they are responding to a simple application request over a TCP connection.

The main target problems for such distributed environments are NP-complete (non-deterministic polynomial-time complete) problems. For these problems no polynomial-time algorithm is known, so to find the right solution one has to evaluate many possible alternatives. The property that can be exploited here is that all the alternative solutions can be evaluated in parallel, and therefore different machines across the Internet can be used simultaneously to evaluate thousands of candidate solutions for any such problem. In this case, the protocol used for this purpose is TCP. To understand the implementation, one first needs a brief idea of the TCP checksum.

4.1 TCP Checksum

The checksum field is the 16-bit one's complement of the one's complement sum of all
16-bit words in the header and text. If a segment contains an odd number of header and text
octets to be checksummed, the last octet is padded on the right with zeros to form a 16-bit word
for checksum purposes. The pad is not transmitted as part of the segment. While computing the
checksum, the checksum field itself is replaced with zeros. This information is carried in the
Internet Protocol and is transferred across the TCP/Network interface in the arguments or results
of calls by the TCP on the IP.
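This definition translates directly into code. A short Python sketch of the computation just described (big-endian 16-bit words, as they appear on the wire):

```python
def tcp_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of all
    16-bit words; odd-length data is padded on the right with a zero
    octet for the computation only (the pad is not transmitted)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit word
    while total >> 16:                          # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# A receiver re-sums the segment including the transmitted checksum;
# an intact segment folds to a checksum of zero:
segment = b"\x12\x34\x56\x78"
c = tcp_checksum(segment)
```

Here `tcp_checksum(segment + c.to_bytes(2, "big"))` evaluates to 0; this pass/fail test at the receiver is the match that the parasitic node engineers to signal a valid candidate solution.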

5. HOW PARASITIC COMPUTING WORKS

Although any possible solution to such problems can be verified quickly, there is no known efficient way to identify a solution in the first place. In fact, the most notable characteristic of such problems is that there is no fast solution: the time required grows exponentially with the size of the problem. The time required to solve even a moderately large NP-complete problem can easily reach billions, if not trillions, of years using any modern computing technology available today. For this reason, even determining whether or not a fast solution to such problems exists is one of the principal unsolved problems of computer science. The parasitic computer starts the process by transmitting specially generated messages to a number of targeted web servers, each of which is treated as consisting of an arithmetic and logic unit (ALU) and a network interface (NIF). The packet carrying one of the possible solutions to the problem is injected at the IP level, bypassing the parasitic node's own TCP.

The parasitic computer generates a message in such a way that if the solution is not valid,
it will fail the TCP checksum on the destination machine and the packet will be dropped. But in
the case when the solution is correct, it will be propagated to the HTTP layer via TCP. Since it is
a behavior of a web server to respond to any requests coming to an HTTP layer regardless of
whether it understands the request or not, the web server will send a response back to the
parasitic computer that it has received an HTTP request. Thus, although the parasitic computer sends out a message for each possible solution, it receives responses back from the server only when the candidate is one of the actual solutions of the problem.
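The round trip can be simulated end to end in ordinary Python. Note the hedge: `fake_server` merely stands in for the remote TCP layer, and `encode` performs the clause evaluation on the parasite's side purely for illustration; in the real exploit the message is arranged so that the target's checksum arithmetic itself performs that evaluation. The instance data is a made-up toy example.

```python
from itertools import product

def checksum16(data: bytes) -> int:
    """TCP-style 16-bit one's complement checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Toy 3-SAT instance: positive k = x_k, negative k = NOT x_k.
CLAUSES = [(1, -2, 3), (-1, 2, -3), (1, 2, 3)]

def encode(assignment):
    """Each 16-bit word of the payload is 1 exactly when the
    corresponding clause is satisfied by the candidate."""
    words = bytearray()
    for clause in CLAUSES:
        sat = any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
        words += (1 if sat else 0).to_bytes(2, "big")
    return bytes(words)

# The parasite plants the 'all clauses true' checksum up front,
# which it can compute without knowing any solution:
PLANTED = checksum16(b"\x00\x01" * len(CLAUSES))

def fake_server(payload, claimed):
    """Remote TCP layer: drop on checksum mismatch (no reply),
    otherwise pass the packet up and respond."""
    return checksum16(payload) == claimed

# The parasite 'sends' every candidate; only actual solutions
# come back acknowledged:
solutions = [a for a in product([False, True], repeat=3)
             if fake_server(encode(a), PLANTED)]
```

Because each clause word is 0 or 1, the payload checksum can equal `PLANTED` only when every clause evaluates to 1, so a server reply is equivalent to "this candidate satisfies the formula".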

Elegant and effective as it proves to be, this approach to computing has some major problems. Since most of the computers connected to the network use TCP/IP, the resources available to the parasitic computer are virtually unlimited, and almost every computer can be exploited. Furthermore, there is a very real possibility that servers end up allocating their valuable CPU cycles to the processing commanded by the parasitic node, degrading the overall performance of the applications running on the server and the access of normal clients.
