SCTP: an innovative transport layer protocol for the web

Conference Paper · January 2006


DOI: 10.1145/1135777.1135867 · Source: DBLP

Preethi Natarajan (1), Janardhan R. Iyengar (1), Paul D. Amer (1) and Randall Stewart (2)
(1) Protocol Engineering Lab, CIS Dept, University of Delaware, {nataraja, iyengar, amer}@cis.udel.edu
(2) Internet Technologies Division, Cisco Systems, [email protected]

ABSTRACT
We propose using the Stream Control Transmission Protocol (SCTP), a recent IETF transport layer protocol, for reliable web transport. Although TCP has traditionally been used, we argue that SCTP better matches the needs of HTTP-based network applications. This position paper discusses SCTP features that address: (i) head-of-line blocking within a single TCP connection, (ii) vulnerability to network failures, and (iii) vulnerability to denial-of-service SYN attacks. We discuss our experience in modifying the Apache server and the Firefox browser to benefit from SCTP, and demonstrate our HTTP over SCTP design via simple experiments. We also discuss the benefits of using SCTP in other web domains through two example scenarios: multiplexing user requests, and multiplexing resource access. Finally, we highlight several SCTP features that will be valuable to the design and implementation of current HTTP-based client-server applications.

Categories and Subject Descriptors
C.2.5 [Computer-Communication Networks]: Local and Wide-Area Networks – Internet; C.2.6 [Computer-Communication Networks]: Internetworking – Standards; C.4 [Performance of Systems]: Design Studies; Fault Tolerance; Reliability, availability and serviceability.

General Terms
Performance, Design, Security.

Keywords
SCTP, Stream Control Transmission Protocol, fault-tolerance, head-of-line blocking, transport layer service, web applications, web transport.

• Prepared through collaborative participation in the Communication and Networks Consortium sponsored by the US Army Research Lab under Collaborative Tech Alliance Program, Coop Agreement DAAD19-01-2-0011. The US Gov’t is authorized to reproduce and distribute reprints for Gov’t purposes notwithstanding any copyright notation thereon.
• Supported by the University Research Program, Cisco Systems, Inc.

Copyright is held by the World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others.
WWW 2006, May 23–26, 2006, Edinburgh, Scotland.
ACM 1-59593-323-9/06/0005.

1. INTRODUCTION
HTTP requires a reliable transport protocol for end-to-end communication. While historically TCP has been used for this purpose, RFC2616 does not require TCP; but until now, no reasonable alternative existed. The Stream Control Transmission Protocol (SCTP), specified in RFC2960, is a recently standardized reliable transport protocol which provides a set of innovative transport layer services unavailable from TCP (or UDP). In this paper, we argue that these services can enhance web transfers, making SCTP a better choice for web transport.

SCTP was originally designed within the IETF SIGTRAN working group to address the shortcomings of TCP for telephony signaling over IP networks [2]. SCTP has since evolved into a general purpose IETF transport protocol, and is well beyond a laboratory research project. More than 25 SCTP implementations currently exist, including kernel implementations for FreeBSD, NetBSD, OpenBSD, Mac OS X, Linux, Solaris, AIX, and HP-UX; and user-space implementations for Windows, on proprietary platforms for Cisco, Nokia, Siemens, and other vendors. Eight interoperability workshops over the past five years have fine-tuned these implementations [14].

Of SCTP’s new services and features, SCTP multistreaming provides an application with logically separate data streams to transfer multiple independent objects, SCTP multihoming provides transparent fault-tolerance to applications on multihomed end hosts, and SCTP’s four-way handshake during association (SCTP’s term for a connection) establishment avoids denial-of-service SYN attacks. In this paper, we discuss these features and their applicability to web transfers.

The paper is organized as follows. Section 2 details how SCTP solves three specific limitations that occur when HTTP-based client-server applications use TCP: head-of-line blocking, disruption due to network failures, and SYN attacks. Section 3 overviews our modifications to Apache and Firefox architectures to operate over SCTP. We also analyze how their original architectures limit full utilization of SCTP’s new features. In Section 4, we explore web domains other than general browsing, and articulate how these domains can benefit from SCTP. Section 5 elaborates other SCTP features and relevant SCTP work that might be useful for HTTP-based network applications. Section 6 summarizes and concludes the paper.

2. HTTP OVER TCP CONCERNS
In this section, we discuss three major concerns in using TCP for web transport, and how our choice, SCTP, effectively addresses all of these concerns.

2.1 Head of line blocking
Consider the simple case of a web browser displaying a web page. Using HTTP/1.1 that supports persistent and pipelined connections, the browser opens a new transport connection to the server, and sends an HTTP GET request with the desired URI. The server returns an HTTP response with the page contents. This page may contain URIs of embedded objects. The browser parses the content for these URIs, and sends pipelined HTTP GET requests for each of the URIs. As responses arrive from the server, the browser displays the webpage with its embedded objects.
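As a rough illustration of the request pattern just described, the following sketch parses a page for embedded object URIs and builds the pipelined GET requests that would be written back-to-back on one persistent connection. It is not taken from any browser: the host name `www.example.org`, the class `EmbeddedObjectParser`, and the helper `build_pipelined_requests` are hypothetical names chosen for this example, only `<img>` tags are treated as embedded objects, and no network I/O is performed.

```python
from html.parser import HTMLParser


class EmbeddedObjectParser(HTMLParser):
    """Collects the URIs of embedded images from a page (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.uris = []

    def handle_starttag(self, tag, attrs):
        # Only <img src="..."> is considered here; real pages embed more kinds.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.uris.append(value)


def build_pipelined_requests(host, uris):
    """Concatenate one GET per URI so they can all be sent on a single
    persistent connection without waiting for the earlier responses."""
    requests = [f"GET {uri} HTTP/1.1\r\nHost: {host}\r\n\r\n" for uri in uris]
    return "".join(requests).encode("ascii")


page = '<html><body><img src="/a.jpg"><img src="/b.jpg"></body></html>'
parser = EmbeddedObjectParser()
parser.feed(page)
payload = build_pipelined_requests("www.example.org", parser.uris)
```

Because the responses to these requests come back in request order on one bytestream, this is the pattern for which delivery ordering at the transport layer matters.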

In general, objects embedded within a web page are independent of each other. That is, requesting and displaying each object in the page does not depend on the reception of other embedded objects. This “degree of freedom” is best exploited by concurrently downloading and rendering the independent embedded objects.

At the transport layer, TCP offers a single sequential bytestream to an application; all application data are serialized and sent sequentially over the single bytestream. In addition, TCP provides in-order delivery within this bytestream: if a transport protocol data unit (TPDU) is lost in the network, successive TPDUs arriving at the TCP receiver will not be delivered to the application until the lost TPDU is retransmitted and received. Hence, when TCP is used for web transport, a lost TPDU carrying a part of a web object may block delivery of other successfully received independent web objects. This problem, known as head-of-line (HOL) blocking, is due to the fact that TCP cannot logically separate independent application level objects in its transport and delivery mechanisms.

HOL blocking also results in unnecessary filling of the receiver’s transport layer buffer space. Reliable transport protocols such as TCP use a receiver buffer to store TPDUs that arrive out-of-order. Once missing TPDUs are successfully retransmitted, data in the receiver buffer is ordered and delivered to the application. This buffer fill up is unnecessary in cases when ‘later received’ TPDUs belong to a different application object than the earlier lost TPDU(s). The required amount of buffer space increases with the loss probability in the transmission path, and the number of independent objects to be transferred.

Note that HOL blocking is particularly exacerbated in domains with low bandwidth and/or high loss rates. With the proliferation of mobile phones, and the increasing use of web browsers and other web applications on mobile phones, increased HOL blocking will cause significant user-perceived delays.

To alleviate HOL blocking, web browsers usually open multiple TCP connections to the same web server [5]. All HTTP GET requests to the server are distributed among these connections, avoiding HOL blocking between the corresponding responses. However, multiple independent objects transferred within one of the several parallel connections still suffer from HOL blocking.

Using multiple TCP connections for transferring a single application’s data introduces many negative consequences for both the application and the network. Previous work such as Congestion Manager [6] and Transaction TCP [17] analyze these consequences in depth, which we summarize:

• Aggressive behavior during congestion: TCP’s algorithms maintain fairness among TCP (and TCP-like) connections. A TCP sender reduces its congestion window by half when network congestion is detected [13]. This reduction is a well understood and recommended procedure for maintaining stability and fairness in the network [18,19]. An application using multiple TCP connections gets an unfair share of the available bandwidth in the path, since all of the application’s TCP connections may not suffer loss when there is congestion in the transmission path. If m of the n open TCP connections suffer loss, the multiplicative decrease factor for the connection aggregate at the sender is (1 - m/(2n)) [8]. This decrease factor is often greater than one-half, and therefore an application using parallel connections is considered an aggressive sender. This aggressive behavior leads to consumption of an unfair share of the bottleneck bandwidth as compared to applications using fewer connections.

• Absence of integrated loss detection and recovery: Web objects are typically small, resulting in just a few TPDUs per HTTP response. In these cases, a TPDU loss is often recoverable only through an expensive timeout at the web server due to an insufficient number of duplicate acks to trigger a fast retransmit [8]. Though this problem is lessened in HTTP/1.1 due to persistent connections and pipelined requests, it still exists while using multiple TCP connections since separate connections cannot share ack information for loss recovery.

• Increased load on web server: The web server has to allocate and update a Transmission Control Block (TCB) for every TCP connection. Use of parallel TCP connections between client and server increases TCB processing load on the server. Under high loads, some web servers may choose to drop incoming TCP connection requests due to lack of available memory resources.

• Increased connection establishment latency: Each TCP connection goes through a three-way handshake for connection establishment before data transfer is possible. This handshake wastes one round trip for every connection opened to the same web server. Any loss during connection setup can be expensive since a timeout is the only means of loss detection and recovery during this phase. Increasing the number of connections increases the chances of losses during connection establishment, thereby increasing the overall average transfer time.

Congestion Manager (CM) [6] attempts to solve the first two problems. CM is a shim layer between the transport and network layers which aggregates congestion control at the end host, thereby enforcing a fair sending rate when an application uses multiple TCP connections to the same end host. “TCP Session” [7] proposes integrated loss recovery across multiple TCP connections to the same web client (these multiple TCP connections are together referred to as a TCP session). All TCP connections within a session are assumed to share the transmission path to the web client. A Session Control Block (SCB) is maintained at the sender to store information about the shared path such as its congestion window and RTT estimate. Both CM and TCP Session still require a web browser to open multiple TCP connections to avoid HOL blocking, thereby increasing the web server’s load.

Apart from solving the network related problems due to parallel TCP connections, there has also been significant interest in designing new transport and session protocols that better suit the needs of HTTP-based client-server applications than TCP. Several experts agree (for instance, see [28]) that the best transport scheme for HTTP would be one that supports datagrams, provides TCP compatible congestion control on the entire datagram flow, and facilitates concurrency in GET requests. WebMUX [29] was one such session management protocol that was a product of the (now historic) HTTP-NG working group [30]. WebMUX proposed use of a reliable transport protocol to provide web transfers with “streams” for transmitting independent objects. While the WebMUX effort did not mature, SCTP is a current IETF standards-track protocol with several implementations and a growing deployment base, and offers many of the core features that were desired of WebMUX.
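The aggregate decrease factor quoted in the list of parallel-connection drawbacks above can be checked with a few lines of arithmetic. The sketch below assumes, for simplicity, that all n connections hold equal congestion windows before the loss event, which is how the (1 - m/2n) expression falls out: m windows are halved, so the aggregate loses m/2 out of n window units. The function name `aggregate_decrease_factor` is ours, not from the cited work.

```python
def aggregate_decrease_factor(m, n):
    """Factor applied to the combined window of n parallel TCP connections
    when m of them each halve their window: 1 - m/(2n).
    Assumes equal per-connection windows before the loss (a simplification)."""
    if n <= 0 or not 0 <= m <= n:
        raise ValueError("need n > 0 and 0 <= m <= n")
    return 1 - m / (2 * n)


# With four parallel connections, print the factor for m = 0..4 losses.
for m in range(5):
    print(m, aggregate_decrease_factor(m, 4))
```

Only when every connection sees the loss (m = n) does the aggregate back off by the same one-half that a single TCP connection would; for any smaller m the aggregate backs off less.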

We propose using SCTP’s multistreaming feature, a previously unavailable transport layer service specifically designed to avoid HOL blocking when transmitting logically independent application objects. An SCTP stream is a unidirectional data flow within an SCTP association. Independent application objects can be transmitted in different streams to maintain their logical separation during transfer and delivery. Note that an SCTP association is subject to congestion control similar to TCP. Hence, all SCTP streams within an association are subject to shared congestion control, and thus multistreaming does not violate TCP’s fairness principles.

Figure 1. Multistreamed association between two hosts

Figure 1 illustrates a multistreamed association between hosts A and B. In this example, host A uses three output streams to host B (numbered 0 to 2), and has only one input stream from host B (numbered 0). The number of input and output streams in an SCTP association is negotiated during association setup.

SCTP uses stream sequence numbers (SSNs) to preserve data order within each stream. However, maintaining order of delivery between TPDUs transmitted on different streams is not a constraint. That is, data arriving in-order within an SCTP stream is delivered to the application without regard to data arriving on other streams.

To transfer independent web objects without HOL blocking, each object can be sent in a separate stream, all within a single association. SCTP uses a single global Transmission Sequence Number (TSN), which provides integrated loss detection and recovery across streams; loss in one stream can be detected via acks for data on other streams. Also congestion control is shared; a web browser using this solution will be no more aggressive than a web browser using a single TCP connection. Connection establishment latency does not increase with multistreaming. While every association setup requires a four-way handshake, data transfer can begin in the third leg (See Section 2.3).

In the KAME SCTP implementation [12][14], the SCTP TCB is approximately twice the size of a TCP TCB. The memory overhead per inbound or outbound stream is 16 bytes, causing the TCB memory requirements for two parallel TCP connections to be roughly equal to the requirements for a single SCTP association with two pairs (inbound and outbound) of streams. However, to achieve higher concurrency, memory overhead when increasing the number of TCP connections is much greater than when increasing the number of streams within an SCTP association.

The size of a TCP TCB is quite high (~700 bytes) when compared to memory overhead for a pair of SCTP streams (32 bytes). Using these values, the memory requirements for the TCP and SCTP cases can be approximated as:

For n parallel TCP connections
    = [n * (TCP TCB size)] bytes
    = [n * 700] bytes

For 1 SCTP association with n pairs of streams
    = [(SCTP TCB size) + (n * 32)] bytes
    = [(2 * TCP TCB size) + (n * 32)] bytes
    = [1400 + (n * 32)] bytes

From the above calculations, it is evident that the memory required for the TCP case increases rapidly with n (n > 2) when compared to the SCTP case. Note that with SCTP multistreaming, apart from the lower memory overhead, a web server also incurs the lower processing load of only one TCB per web client.

We discuss a more detailed mapping of HTTP over SCTP, and our implementation of this mapping in Section 3.

2.2 Network Failures
Critical web servers rely on redundancy at multiple levels to provide uninterrupted service during resource failures. A host is multihomed if it can be addressed by multiple IP addresses [4]. Multihoming a web server offers redundancy at the network layer, provided that the web server remains accessible even when one of its IP addresses becomes unreachable, say due to an interface or link failure, severe congestion, or slow route convergence around path outages.

Multihoming end hosts is becoming increasingly economical. For instance, today’s relatively inexpensive access to the Internet motivates home users to have simultaneous wired and wireless connectivity through multiple ISPs, thereby increasing the end host’s fault tolerance at an economically feasible cost.

TCP is ignorant of multihoming. Even if end hosts have multiple interfaces, an application using TCP cannot leverage this network layer redundancy, since TCP allows the application to bind to only one network address at each end of a connection. For example, in Figure 2, assume that host A runs a web server and host B runs a web client. Using TCP, the web client can use one interface (B1), to connect to one interface (A1) at the web server. If A1 fails, the web server becomes unreachable to all the clients connected through A1, including B, and the corresponding TCP connections are aborted. Unfortunately, the redundant active network interface, A2, could not be used by the clients connected through A1.

Figure 2. Multihomed end hosts
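Returning briefly to the multistreaming discussion of Section 2.1, the difference between a single ordered bytestream and per-stream ordering can be made concrete with a toy receiver model. This is only a sketch under simplifying assumptions: each tuple is (global sequence number, stream, SSN), one stream carries one object, loss detection and retransmission timing are not modeled, and the function names are hypothetical.

```python
def deliver_single_bytestream(arrivals):
    """TCP-like receiver: release data strictly in global sequence order."""
    received = set()
    next_seq = 1
    log = []                      # (arrival step, seq) for each delivery
    for step, (seq, _stream, _ssn) in enumerate(arrivals):
        received.add(seq)
        while next_seq in received:
            log.append((step, next_seq))
            next_seq += 1
    return log


def deliver_multistream(arrivals):
    """SCTP-like receiver: order is enforced only within each stream."""
    waiting = {}                  # (stream, ssn) -> seq held back for ordering
    next_ssn = {}                 # stream -> next SSN owed to the application
    log = []
    for step, (seq, stream, ssn) in enumerate(arrivals):
        waiting[(stream, ssn)] = seq
        expected = next_ssn.get(stream, 0)
        while (stream, expected) in waiting:
            log.append((step, waiting.pop((stream, expected))))
            expected += 1
        next_ssn[stream] = expected
    return log


# Three objects of two TPDUs each; seq 3 (first TPDU of the second object)
# is lost and its retransmission arrives last.
arrivals = [(1, 1, 0), (2, 1, 1), (4, 2, 1), (5, 3, 0), (6, 3, 1), (3, 2, 0)]
tcp_log = deliver_single_bytestream(arrivals)
sctp_log = deliver_multistream(arrivals)
```

In this run the single-bytestream receiver holds TPDUs 5 and 6 of the third object until the retransmission arrives at step 5, whereas the per-stream receiver delivers them at steps 3 and 4; only the second object waits for its own missing TPDU.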
To provide applications on multihomed end hosts with resilience to such failures, SCTP supports multihoming, a transport layer
feature providing transparent network failure detection and recovery. SCTP allows binding a transport layer association to multiple IP addresses at each end host. An end point chooses a single primary destination address for sending new data. SCTP monitors the reachability of each destination address through two mechanisms: acks of data and periodic probes known as heartbeats. Failure in reaching the primary destination results in failover, where an SCTP endpoint dynamically chooses an alternate destination to transmit the data, until the primary destination becomes reachable again.

In Figure 2, a single SCTP association is possible between addresses A1, A2 at the server and B1, B2 at the client. Assuming A1 is the primary destination for the client, if A1 becomes unreachable, multihoming keeps the SCTP association alive through failover to alternate destination A2, and allows the end host applications to continue communicating seamlessly.

Ongoing research on Concurrent Multipath Transfer (CMT) [15] proposes to use multihoming for parallel load sharing. During scenarios where multiple active interfaces between source and destination connect through independent paths, CMT simultaneously uses these multiple paths to transfer new data, increasing throughput for a networked application. Thus, a multihomed web client and server running on SCTP can leverage CMT’s throughput improvements for web transfers.

2.3 SYN Attacks
A SYN attack is a common denial of service (DoS) technique that has often disabled the services offered by a web server. During the three-way TCP connection establishment handshake, when a TCP server receives a SYN, the TCP connection transitions to the TCP half-open state. In this state, the server allocates memory resources, stores state for the SYN received, and replies with a SYN/ACK to the sender. The TCP connection remains half open until it receives an ACK for the SYN/ACK resulting in connection establishment, or until the SYN/ACK expires with no ACK. However, the latter scenario results in unnecessary allocation of server’s resources for the TCP half open connection.

When a malicious user orchestrates a coordinated SYN attack, thousands of malicious hosts flood a predetermined TCP server with IP-spoofed SYN requests, causing the server to allocate resources for many half open TCP connections. The server’s resources are thus held by these fabricated SYN requests, denying resources to legitimate clients. Such spoofed SYN attacks are a significant security concern, and an inherent vulnerability with TCP’s three-way handshake. Web administrators try to reduce the impact of such attacks by limiting the maximum number of half open TCP connections at the server, or through firewall filters that monitor the rate of incoming SYN requests.

To protect an end host from such SYN attacks, SCTP uses a four-way handshake with a cookie mechanism during association establishment. The four-way handshake does not increase the association establishment latency, since data transfer can begin in the third leg. As shown in Figure 3, when host A initiates an association with host B, the following process ensues:

1. A sends an INIT to B.
2. On receipt of the INIT, B does not allocate resources to the requested association. Instead, B returns an INIT-ACK to A with a cookie that contains: (i) necessary details required to identify and process the association, (ii) life span of the cookie, and (iii) signature to verify the cookie’s integrity and authenticity.
3. When A receives the INIT-ACK, A replies with a COOKIE-ECHO, which echoes the cookie that B previously sent. This COOKIE-ECHO may carry A’s application data to B.
4. On receiving the COOKIE-ECHO, B checks the cookie’s validity, using the state information in the cookie. If the cookie verifies, B allocates resources and establishes the association.

Figure 3. SCTP association establishment

With SCTP’s four-way handshake, a web client that initiates an association must maintain state before the web server does, avoiding spoofed connection request attacks.

3. APACHE AND FIREFOX OVER SCTP
To investigate the viability of HTTP over SCTP, we modified Apache and Firefox to run over SCTP in FreeBSD 5.4 [12]. These modified implementations are publicly available [20]. In this section, we list our design guidelines, and discuss the rationale behind our final design. We then present the relevant architectural details of Apache and Firefox, and describe our changes to their implementation. Finally, we discuss their architectural limitations which do not allow the application to fully benefit from SCTP multistreaming. These limitations are possibly shared by other web servers and browsers as well.

3.1 Design Guidelines
Two guidelines that governed our HTTP over SCTP design were:
• Make no changes to the existing HTTP specification, to reduce deployment concerns.
• Minimize SCTP-related state information at the server so that SCTP multistreaming does not become a bottleneck for performance.

An important design question to address was: which end (the client or server) should decide on the SCTP stream to be used for an HTTP response? Making the web server manage some form of SCTP stream scheduling is not desirable, as it involves maintaining additional state information at the server. Further, the client is better positioned to make scheduling decisions that rely on user perception and the operating environment. We therefore concluded that the client should decide object scheduling on streams.
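Returning to the four-step association setup of Section 2.3, the idea of a server that stores nothing until a valid cookie comes back can be sketched as follows. This is a minimal illustration, not SCTP's wire format: the JSON encoding, the HMAC-SHA256 signature, the `SECRET` key, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac
import json

SECRET = b"server-local-secret"   # hypothetical key known only to host B


def make_cookie(assoc_details, now, lifespan_s=60):
    """Step 2: B builds a signed cookie for the INIT-ACK and keeps no state."""
    body = json.dumps(
        {"assoc": assoc_details, "created": now, "lifespan": lifespan_s},
        sort_keys=True,
    ).encode()
    signature = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return body, signature


def accept_cookie_echo(body, signature, now):
    """Step 4: B validates the echoed cookie before allocating resources."""
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return None                      # forged or corrupted cookie
    fields = json.loads(body)
    if now > fields["created"] + fields["lifespan"]:
        return None                      # cookie has outlived its life span
    return fields["assoc"]               # only now would B set up its state


# Host A's INIT arrives at time 1000; B answers with the cookie and forgets A.
body, sig = make_cookie({"peer": "A", "streams": 3}, now=1000)
```

A spoofed INIT therefore costs the server one signature computation and no memory, since everything needed to resume the setup travels inside the cookie that the initiator must echo.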

We considered two designs by which the client conveys the selected SCTP stream to the web server: (1) the client specifies the stream number in the HTTP GET request and the server sends the corresponding response on this stream, or (2) the server transmits the HTTP response on the same stream number on which the corresponding HTTP request was received. Design (1) can use just one incoming stream and several outgoing streams at the server, but requires modifications to the HTTP GET request specification. Design (2) requires the server to maintain as many incoming streams as there are outgoing streams, increasing the memory overhead at the server. The KAME SCTP TCB uses 16 bytes for every inbound or outbound stream. We considered this memory overhead per stream to be insignificant when compared to changes to HTTP specification, and chose option (2).

3.2 Apache
We chose the Apache (version 2.0.55) open source web server for our task. In this section, we give an overview of Apache’s architecture, and our modifications to use SCTP streams.

3.2.1 Architecture
The Apache HTTP server has a modular architecture. The main functions related to server initialization, listen/accept connection setup, HTTP request parsing, and memory management are handled by the core module. The remaining accessory functions such as request redirection, authentication, and dynamic content handling are performed by separate modules. The core module relies on the Apache Portable Runtime (APR), a platform independent API, for network, memory and other system dependent functions.

Apache has a set of multi-processing architectures that can be enabled during compilation. We considered the following architectures: (1) prefork, a non-threaded pre-forking server, and (2) worker, a hybrid multi-threaded multi-processing server. With prefork, a configurable number of processes are forked during server initialization, and are set up to listen for connections from clients. With worker, a configurable number of server threads and a listener thread are created per process. The listener thread listens for incoming connections from clients, and passes the connection to a server thread for request processing.

In both architectures, a connection structure is maintained throughout a transport connection’s lifetime. Apache uses filters: functions through which different modules process an incoming HTTP request (input filters) or outgoing HTTP response (output filters). The core module’s input filter calls the APR read API for reading HTTP requests. Once the HTTP request syntax is verified, a request structure is created to maintain state related to the HTTP request. After processing the request, the core module’s output filter calls the APR send API for sending the consequent HTTP response.

3.2.2 Changes
To adapt Apache to use SCTP streams, the APR read and send API implementations were modified to collect the SCTP input stream number on which a request is read, and to send the response on the corresponding output stream. During a request’s lifetime, a temporary storage place stores the SCTP stream number for the request. The initial design was to use the socket or connection structures for the purpose. But pipelined HTTP requests from potentially different SCTP streams can be read from the same socket or connection, overwriting previous information. Hence these structures were avoided. In our implementation, stream information related to an HTTP request is stored in the request structure, and is exchanged between the APR and the core module through Apache’s storage buffers (bucket brigades).

Apache uses a configuration file that allows users to specify various parameters. We made changes to the Listen directive syntax in the configuration file so that a web administrator can specify the transport protocol (TCP or SCTP) to be used by the web server.

3.3 Firefox
We chose the Firefox (version 1.6a1) browser since it is a widely used open-source browser. In this section, we briefly discuss Firefox’s architecture, and its adaptation to work over SCTP streams.

3.3.1 Architecture
Firefox belongs to the Mozilla suite of applications which have a layered architecture. A set of applications, such as Firefox and Thunderbird (mail/news reader), belong to the top layer. These applications rely on the services layer for access to network services. The services layer uses platform independent network APIs offered by the Netscape Portable Runtime (NSPR) library in the runtime layer. NSPR maintains a methods structure with function pointers to various I/O and other management functions for TCP and UDP sockets.

Firefox has a multi-threaded architecture. To render a web page inside a Firefox tab, first the HTTP protocol handlers parse the URL, and use the socket services to open a TCP connection to the web server. Once the TCP connection is set up, an HTTP GET request for the web page is sent. After the web page is retrieved and parsed, further HTTP requests for embedded objects are pipelined over the same TCP connection if the connection persists; else over a new TCP connection.

In the version we used, Firefox never opened more than one TCP connection for a simple transaction to the same web server. However, when we requested multiple news-feeds from the same web server in different tabs (“Open all in tabs” feature, where multiple pages are displayed concurrently), Firefox opened multiple TCP connections to the same web server, one for each tab.

3.3.2 Changes
Adapting Firefox to work on SCTP streams involved modifications in its services layer to open an SCTP socket instead of a TCP socket, and creating a new methods structure in NSPR for SCTP related I/O and management functions.

During SCTP association setup with the server, Firefox requests a specific number of SCTP input and output streams. (In SCTP, this request can be negotiated down by the server in the INIT-ACK.) For our purposes, the number of input streams is set to equal the number of output streams, thus assuring that the Firefox browser receives a response on the same stream number as the one on which it sends a request.

Our Firefox changes provide flexibility to do HTTP request scheduling over SCTP streams. The current implementation picks SCTP streams in a round-robin fashion. Other scheduling approaches can be considered in the future. For example, in a lossy network environment, such as wide area wireless connectivity through GPRS, a better scheduling policy might be ‘smallest pending object first’ where the next GET request goes on the SCTP stream that has the smallest sum of object sizes pending transfer. Such a policy reduces the probability of HOL
blocking among the response for the most recent GET request and the
responses for previous requests transmitted on the same SCTP stream.

With Firefox’s current design, the choice of the transport protocol (TCP or
SCTP) must be decided at compile time. In the future, it will be beneficial
to have this choice as a configurable parameter.

3.4 SCTP Multistreaming Avoids HOL
We present two simple experiments to visualize the differences between the
current HTTP over TCP design, and our HTTP over SCTP multistreaming design.
Our goal is to demonstrate how HTTP over SCTP multistreaming avoids HOL
blocking.

The experiment topology, shown in Figure 4, uses three nodes: a custom web
client (FreeBSD 5.4) and an Apache server (FreeBSD 5.4) connected by
Dummynet (FreeBSD 4.10) [24]. Dummynet’s traffic shaper configures a 56Kbps
duplex link, with a queue size of 50KB and zero added propagation delay
between client and server. This link has no loss in the direction from
client to server, and 10% loss from server to client.

Figure 4. Experiment Topology

In both experiments, the client requests a web page containing 5 embedded
5.5KB objects (for example, a photo album page containing 5 embedded JPEG
images) from the Apache server. In the first experiment, the web client and
Apache communicate over a single TCP connection, and, in the second they
communicate over a single SCTP association with one stream for each
embedded object.

Using timestamp information collected from tcpdump [25] traces at the
client, Figures 5 and 6 plot PDU receipt times at the transport and
application layers in the TCP and SCTP runs, respectively. A point labeled
‘n’ denotes the arrival of one of object n’s TPDUs at the receiving
transport layer. A corresponding ‘X’ denotes the earliest calculated time
when the data in that TPDU is delivered by the transport layer to the
application.

In both scenarios, TPDU 6 (2nd TPDU of object 2) is lost, and its
retransmission arrives just after time=4 seconds. This loss causes the
remaining TPDUs to arrive ‘out-of-order’ at the client’s transport layer.
In HTTP over TCP (Figure 5), HOL blocking by object 2 causes TCP to delay
delivery of data in objects 3, 4 and 5 until the successful retransmission
of TPDU 6. Note that even after this retransmission, TCP is still blocked
from delivering object 5 to the application due to loss of TPDU 19 (5th
TPDU of 4th object). In HTTP over SCTP (Figure 6), the TPDUs for each
object arrive on a different SCTP stream. Hence, the loss of TPDU 6
(object 2’s TPDU; see footnote 1) does not block application delivery of
objects 3, 4 or 5. Note that in Figure 6, the initial four PDUs of object 4
are delivered without HOL blocking. The final TPDU of object 4 is lost and
is delivered only after the retransmission arrives.

We believe that SCTP multistreaming and the absence of HOL blocking open up
opportunities for a new range of browser and server features, which we
discuss in detail in the following sections.

[Plot: TCP TPDU number (0 to 25) versus time (0 to 4.5 sec) for objects 1
through 5. Legend: ‘#’ = PDU of object # received at Transport; ‘X’ = PDU
delivered to Application.]
Figure 5. HOL blocking in HTTP over TCP

[Plot: SCTP TPDU number (0 to 25) versus time (0 to 4.5 sec) for objects 1
through 5. Legend: ‘#’ = PDU of object # received at Transport; ‘X’ = PDU
delivered to Application.]
Figure 6. No HOL blocking in HTTP over SCTP

Footnote 1: In our SCTP experiment, each application write generated an
SCTP TPDU, causing a one to one correspondence between a TPDU’s
Transmission Sequence Number (TSN) and the TPDU number.

3.5 Browser/Server Architectural Discussion
Even if SCTP can deliver requests and responses of independent web objects
without HOL blocking, the current Apache and Firefox architectures are
unable to take full advantage of SCTP’s multistreaming benefits. We explain
a browser side architectural limitation, and propose a solution. We also
explain a server side
architectural change that can enhance the server’s performance, especially
in lossy environments.

3.5.1 Browser Limitation
If SCTP has received partial data for n independent web objects on
different streams, SCTP will deliver these n partial responses to the web
browser as long as TPDUs within each stream arrived in sequence. A web
browser now has the opportunity to read and render these n responses
concurrently. This browser capability, known as parallel rendering, can be
optionally used to improve user perception since multiple web objects start
appearing in parallel on the corresponding web page.

Parallel rendering is difficult to realize with the current Firefox
architecture. Firefox dedicates a single thread to a transport layer
connection. This design reflects Firefox’s assumption regarding TCP as the
underlying transport. With TCP, objects can be received only sequentially
within a single connection; hence a single thread to read the HTTP
responses is sufficient. In the modified implementation, a single thread
gets dedicated to an SCTP association. Consequently, the thread sends the
pipelined HTTP GET requests, and reads the responses in sequence.
Therefore, multiple streams within an SCTP association are still handled
sequentially by the thread, allowing the thread to render at most one
response at a time.

One possible solution to realize parallel rendering in Firefox (or any
multi-threaded web browser) is to use multiple threads to request and
render web page objects via one SCTP association. Multiple threads, one for
each object, send HTTP GET requests over different SCTP streams of the
association. The number of SCTP streams to employ for a web page can be
either user configurable, or dynamically decided by the browser. The same
thread that sends the request for an object can be responsible for
rendering the response. However, it is necessary that a single ‘reader’
thread reads all the HTTP responses for a web page, since TPDUs from a web
server containing the different responses can arrive interleaved at the
browser’s transport layer (discussed in Section 3.6).

Multiple threads enable parallel rendering but require considerable changes
to Firefox’s architecture. We suspect that most common web browsers suffer
from a similar architectural limitation.

3.5.2 Server Enhancement
The original multi-threaded Apache dedicates one server thread to each TCP
connection. Our adaptation over SCTP multistreaming dedicates a server
thread to an SCTP association. In this design, the server thread reads HTTP
requests sequentially from the association, even if requests arrive on
different streams in the association.

Apache might achieve better concurrency in serving user requests if its
design enabled multiple threads to read from different SCTP streams in an
association, each capable of delivering independent requests without HOL
blocking. We hypothesize that in lossy and/or low bandwidth environments,
this design can provide higher request service rates when compared to
Apache over TCP, or our current Apache over SCTP.

3.6 Object Interleaving
In this section, we use “imaginary” scenarios to illustrate object
interleaving. Object interleaving ensues when a browser and a server,
capable of transmitting HTTP requests and responses concurrently,
communicate over different streams of an SCTP association. For example,
object interleaving will be observed when a multi-threaded browser and
server, modified as described in Section 3.5, communicate over SCTP
streams. Since such browser and server implementations are in progress, we
use imaginary data to illustrate the concept.

We use two scenarios in our demonstration. Each scenario shows one of the
two extreme cases: the presence of an ideal object interleaving, and no
interleaving. In both scenarios, a multi-threaded browser requests 5
objects from a multi-threaded web server. Every object is the same size and
is distributed over 5 TPDUs, resulting in a total of 25 TPDUs for each
transfer. The transfers do not experience any loss or propagation delay.
The transmission time for each TPDU is around 180ms, resulting in a total
transfer time of ~4.4 seconds.

In the first scenario, the multi-threaded browser and server are adapted as
discussed in Section 3.5. The browser uses 5 threads to send GET requests
concurrently on 5 SCTP streams. Due to this concurrency, the GET requests
get bundled into SCTP TPDUs at the browser’s transport layer. For our
illustration, we consider an ideal bundling where all 5 requests get
bundled into one TPDU. When this TPDU reaches the server’s transport layer,
multiple server threads concurrently read the 5 requests from SCTP and send
responses. The concurrency in sending responses causes TPDUs containing
different objects to get interleaved at the server’s, and hence the
browser’s transport layer, causing object interleaving. Note that the
degree of object interleaving depends on (1) the browser’s request writing
pattern, which dictates how requests get bundled into SCTP TPDUs at the
browser’s transport layer, and (2) the sequence in which the server threads
write the responses for these requests.

For the second scenario, the multi-threaded browser and server are adapted
to use SCTP multistreaming, but do not have the necessary modifications to
concurrently send requests or responses. The browser uses a single thread
to sequentially send the 5 GET requests over 5 SCTP streams. Each request
gets translated to a separate SCTP PDU at the browser’s transport layer.
These 5 SCTP PDUs, and hence the 5 HTTP requests arrive in succession at
the web server, which uses a single thread to read and respond to these
requests. These responses arrive sequentially at the browser’s transport
layer.

Figure 7 illustrates the ideal object interleaving at the browser’s
transport layer, where the first 5 TPDUs are the first TPDUs of all the 5
responses. The next 5 TPDUs correspond to the second TPDUs of the 5
responses and so on. Figure 8 illustrates the scenario of no object
interleaving, and shows how TPDUs corresponding to the 5 responses are
delivered one after the other to the browser.

A browser can optionally take advantage of object interleaving to
progressively render these 5 objects in parallel, vs. complete rendering of
each object in sequence. For example, a browser can render a piece of all 5
objects by time=0.75 seconds (Figure 7) vs. complete rendering of object 1
(Figure 8). By time=2.75 seconds, more than half of all 5 objects can be
rendered in parallel with object interleaving vs. complete rendering of
objects 1 through 3 in case of no interleaving. The dark and the light
rectangles in Figure 7 help visualize the interleaving, and thus the
progressive appearance of objects 2 and 4 on a web page.

Apart from progressive parallel rendering in web browsers, HTTP-based
network applications can take advantage of object interleaving in other
possible ways. For example, if a critical web
client can make better decisions using progressive pieces of all responses
vs. complete responses arriving sequentially, the web application’s design
can gain from object interleaving.

[Plot: SCTP PDU number (0 to 25) versus time (0 to 4.5 sec). Legend: ‘#’ =
PDU of object # received at Application. Successive bands are labeled
“First TPDU of all objects” through “Fifth TPDU of all objects”.]
Figure 7. HTTP over SCTP with object interleaving

[Plot: SCTP TPDU number (0 to 25) versus time (0 to 4.5 sec). Legend: ‘#’ =
PDU of object # received at Application.]
Figure 8. HTTP over SCTP without object interleaving

We point out that object interleaving between a browser and server
communicating over TCP is infeasible without explicit application level
markers that differentiate TPDUs belonging to different interleaved
objects. We feel that such markers try to emulate SCTP multistreaming at
the application layer. Also, loss of a single TCP PDU in an interleaved
transfer exacerbates the HOL blocking since the loss blocks application
delivery of multiple objects.

Browser architectures that facilitate object interleaving can be designed
such that the browser is able to control the amount of interleaving for
each web transfer. For example, the Section 3.5 modifications to a
multi-threaded browser will empower it with such flexibility as follows. If
the browser uses a single thread to send GET requests sequentially on
different SCTP streams, the responses will arrive without any interleaving,
as shown in Figure 8. On the other hand, if the browser uses multiple
threads to send the requests concurrently on different SCTP streams, the
TPDUs will arrive interleaved as shown in Figure 7. With this flexibility,
the web browser can make on-the-fly decisions about how much object
interleaving to use for each web transfer based on prior knowledge about
the type of objects being transferred. Such knowledge can be either
implicit or explicitly obtained from the web server.

4. OTHER MULTISTREAMING GAINS
We now consider two other web scenarios where SCTP multistreaming might
provide a better solution than existing TCP-based solutions.

4.1 Multiplexing User Requests
Several web server farms and providers of Internet service use TCP
connection multiplexers to improve efficiency [16]. The main goal of these
multiplexers is to decrease the number of TCP connection requests to a
server, and thereby reduce server load due to TCP connection setup/teardown
and state maintenance. The multiplexer, acting as an intermediary,
intercepts TCP connection open requests from different clients, and
multiplexes HTTP requests from different clients onto a set of existing TCP
connections to the server.

In this scenario, a multiplexer is forced to maintain several open
connections to its web server to avoid HOL blocking between independent
users’ requests and responses. Hence, a tradeoff exists in deciding the
number of open connections: fewer connections decrease the server load on
connection maintenance, whereas more connections reduce HOL blocking
between different users’ requests.

SCTP multistreaming can be leveraged to reduce both HOL blocking and server
load in such an environment. A proxy in front of an SCTP-capable web server
can intercept incoming SCTP association open requests from different users.
This proxy can maintain just one SCTP association to the web server, and
can channel incoming requests from different users on different SCTP
streams within this association. Since SCTP multistreaming avoids HOL
blocking, this solution is equivalent to having a separate session or
connection per user. This setup incurs minimal resource consumption at the
server since all data between proxy and server go over a single SCTP
association. This design also takes advantage of integrated congestion
management and loss recovery within the SCTP association (Section 2.1).

There could be scenarios where a web server runs on SCTP to take advantage
of its many features, but a web browser does not have SCTP support. To
facilitate seamless service to such browsers, we can extend the
multiplexing proxy to act as an application level gateway between
HTTP-over-TCP and HTTP-over-SCTP implementations. The proxy can intercept
TCP connection open requests, multiplex user requests on different streams
of a single SCTP association to the server, and forward server responses to
the clients on TCP. This setup ensures the benefits of SCTP multistreaming
at the server side, even when the web clients are not SCTP-aware.

4.2 Multiplexing Resource Access
Today’s web servers deliver much more to users than just browsing content.
For example, business services such as financial planning and tax
preparation are offered over the web, and the user accesses these services
through a web browser. There are also web applications such as online games
and web-based mail that are accessible by a browser. In such web
applications, a user first establishes a session with the server, and the
bulk of the user’s data is stored and processed at the server.
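
The multiplexing idea of Section 4.1 (independent flows carried on separate
streams of one association, each stream ordered on its own) can be
illustrated with a small simulation. This is only a sketch under stated
assumptions: it opens no SCTP socket and uses no SCTP API, and the names
StreamMultiplexer and PerStreamReceiver, the round-robin user-to-stream
assignment, and the per-stream sequence numbers are hypothetical choices
made for illustration, not the authors' design.

```python
class StreamMultiplexer:
    """Hypothetical proxy-side helper: maps each user to one stream
    of a single (simulated) association."""

    def __init__(self, num_streams):
        if num_streams < 1:
            raise ValueError("need at least one stream")
        self.num_streams = num_streams
        self.user_stream = {}                # user -> stream id
        self.next_seq = [0] * num_streams    # next sequence number per stream

    def submit(self, user, request):
        # Assumption: new users are assigned to streams round-robin.
        if user not in self.user_stream:
            self.user_stream[user] = len(self.user_stream) % self.num_streams
        stream = self.user_stream[user]
        seq = self.next_seq[stream]
        self.next_seq[stream] += 1
        return (stream, seq, request)


class PerStreamReceiver:
    """Simulated receiver: delivers in order within a stream only,
    so a gap on one stream never delays another stream."""

    def __init__(self, num_streams):
        self.expected = [0] * num_streams
        self.buffered = [dict() for _ in range(num_streams)]

    def receive(self, message):
        stream, seq, request = message
        self.buffered[stream][seq] = request
        delivered = []
        # Hand over every message that is now in sequence on this stream.
        while self.expected[stream] in self.buffered[stream]:
            delivered.append(self.buffered[stream].pop(self.expected[stream]))
            self.expected[stream] += 1
        return delivered


if __name__ == "__main__":
    mux = StreamMultiplexer(num_streams=4)
    rx = PerStreamReceiver(num_streams=4)
    a1 = mux.submit("alice", "GET /a1")
    b1 = mux.submit("bob", "GET /b1")
    a2 = mux.submit("alice", "GET /a2")
    # Suppose a1 is delayed by a loss: bob's request is still delivered.
    for message in (b1, a2, a1):
        print(message, "->", rx.receive(message))
```

In this sketch a delayed message only holds back later messages of the same
user. If there are more users than streams, several users share a stream
and blocking between those users can reappear, which mirrors the tradeoff
described for TCP connection multiplexers.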
Most organizations rely on third-party data centers to host and maintain
their web-based software services. For load sharing and better performance,
a data center might employ various scheduling policies to logically group
and host many web applications on a server. Consider a policy where
multiple web applications that will be accessed by the business clients or
employees of a single organization are grouped and hosted on the same web
server. For example, the data center might host an organization’s customer
relationship management software and its mail server on the same web
server. In such a case, the employees of the organization will access the
two resources concurrently from the web server. Instead of opening separate
TCP connections for each resource, the user’s browser and the web server
can multiplex the resource access on different streams of a single SCTP
association, reducing load at the server.

5. OTHER USEFUL SCTP FEATURES
Apart from multistreaming, multihoming and protection from SYN attacks, we
present other features and related work on SCTP which we believe could be
useful to HTTP-based network applications or web applications.

• Preservation of message boundaries: SCTP offers a message-oriented data
  transfer to an application, as opposed to TCP’s byte stream data
  transfer. SCTP considers data from each application write as a separate
  message. This message’s boundary is preserved since SCTP guarantees
  delivery of a message in its entirety to a receiving application. Web
  applications where the client and server exchange data as messages can
  benefit from this feature, and avoid using explicit application level
  message delimiters.

• Partial Reliability: RFC3758 describes PR-SCTP, a partial reliability
  extension to RFC2960. This extension enables partially reliable data
  transfer between a PR-SCTP sender and receiver. In TCP, and plain SCTP,
  all transmitted data are guaranteed to be delivered. Alternatively,
  PR-SCTP gives an application the flexibility to notify how persistent the
  transport protocol should be in trying to deliver a particular message,
  by allowing the application to specify a “lifetime” for the message. A
  PR-SCTP sender tries to transmit the message during this lifetime. Upon
  lifetime expiration, a PR-SCTP sender discards the message irrespective
  of whether or not the message was successfully transmitted. This timed
  reliability in data transfer might be useful to web applications that
  regularly generate new data obsolescing earlier data, for example, an
  online gaming application, where a player persistently generates new
  position coordinates. A game client can use PR-SCTP, and avoid
  transmitting the player’s older coordinates when later ones are
  available, thereby reducing network traffic and processing at the game
  server.

• Unordered data delivery: SCTP offers unordered data delivery service. An
  application message, marked for unordered delivery, is handed over to the
  receiving application as soon as the message’s TPDUs arrive at the SCTP
  receiver. Since TCP preserves strict data ordering, using a single TCP
  connection to transmit both ordered and unordered data results in
  unwanted delay in delivering the unordered data to the receiving
  application. Hence, applications such as online game clients that need to
  transmit both ordered and unordered data open a TCP connection for the
  ordered data, and use a separate UDP channel to transmit the unordered
  data [23]. These applications can benefit from SCTP by using a single
  SCTP association to transmit both types of data. As opposed to UDP’s best
  effort transmission, which burdens the application to implement its own
  loss detection and recovery, messages can be transmitted reliably using
  SCTP’s unordered service.

• SCTP shim layer: To encourage application developers and end users to
  widely adopt SCTP and leverage its benefits, a TCP-to-SCTP shim layer has
  been developed [22]. The shim is a proof of concept and translates
  application level TCP system calls into corresponding SCTP calls. By
  using such a shim layer, a legacy TCP-based web application can
  communicate using SCTP without any modifications to the application’s
  source code.

6. CONCLUSION
Though SCTP has TCP-like congestion and flow control mechanisms targeted
for bulk data transfer, we argue that SCTP’s feature-set makes it a better
web transport than TCP. Performance-wise, SCTP’s multistreaming avoids
TCP’s HOL blocking problem when transferring independent web objects, and
facilitates aggregate congestion control and loss recovery.
Functionality-wise, SCTP’s multihoming provides fault-tolerance and scope
for load balancing, and a built-in cookie mechanism in SCTP’s association
establishment phase provides protection against SYN attacks.

We shared our experiences in adapting Apache and Firefox for SCTP
multistreaming, and demonstrated the potential benefits of HTTP over SCTP
streams. We also presented current architectural limitations of Apache and
Firefox that inhibit them from completely realizing the benefits of
multistreaming.

We discussed other systems on the web where SCTP multistreaming may be
advantageous, and hypothesized the potential gains of using SCTP in such
areas. We also outlined other relevant SCTP features that are useful to
HTTP based network applications.

The authors hope that this position paper raises interest within the web
community in using SCTP as the transport protocol for web technologies, and
welcome further research and collaboration along these lines.

7. ACKNOWLEDGMENTS
The authors thank Armando L. Caro Jr. (BBN Technologies), Ethan Giordano,
Mark J. Hufe, and Jonathan Leighton (University of Delaware’s Protocol
Engineering Lab), and the reviewers of WWW2006 for their valuable comments
and suggestions.

8. REFERENCES
[1] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor,
    I. Rytina, M. Kalla, L. Zhang, V. Paxson, “Stream Control Transmission
    Protocol,” RFC 2960, 10/00
[2] R. Stewart, Q. Xie, Stream Control Transmission Protocol (SCTP): A
    Reference Guide, Addison Wesley, 2001, ISBN: 0-201-72186-4
[3] R. Fielding et al., “Hypertext Transfer Protocol – HTTP/1.1,” RFC 2616,
    6/99
[4] R. Braden, “Requirements for Internet hosts – communication layers,”
    RFC 1122, 10/89
[5] Z. Wang, P. Cao, “Persistent connection behavior of popular browsers,”
    Research Note, 12/98,
    www.cs.wisc.edu/~cao/papers/persistent-connection.html
[6] H. Balakrishnan, H.S. Rahul, S. Seshan, “An integrated congestion
    management architecture for Internet hosts,” ACM SIGCOMM, Cambridge,
    8/99
[7] V. N. Padmanabhan, “Addressing the challenges of web data transport,”
    PhD Dissertation, Comp Sci Division, U Cal Berkeley, 9/98
[8] H. Balakrishnan, V. N. Padmanabhan, S. Seshan, M. Stemm, R. Katz, “TCP
    behavior of a busy Internet server: Analysis and Improvements,” IEEE
    INFOCOM, San Francisco, 3/98
[9] The Apache Software Foundation, www.apache.org
[10] Netcraft Web Server Survey,
     news.netcraft.com/archives/web_server_survey.html
[11] Mozilla Suite of Applications, www.mozilla.org
[12] The KAME Project, www.kame.net/
[13] V. Jacobson, “Congestion avoidance and control,” ACM SIGCOMM,
     Stanford, 8/88
[14] Stream Control Transmission Protocol, www.sctp.org/
[15] J. Iyengar, P. Amer, R. Stewart, “Concurrent multipath transfer using
     SCTP multihoming over independent end-to-end paths,” IEEE/ACM Trans on
     Networking (to appear)
[16] Accelerated Traffic Management, Array Networks,
     www.arraynetworks.net/products/TMX1100.asp
[17] R. Braden, “Transaction TCP - Concepts,” RFC 1379, 9/92
[18] D.M. Chiu, R. Jain, “Analysis of the increase and decrease algorithms
     for congestion avoidance in computer networks,” Computer Networks and
     ISDN Systems, 17(1):1-14, 6/89
[19] M. Allman, V. Paxson, W. Stevens, “TCP Congestion Control,” RFC 2581,
     4/99
[20] Protocol Engineering Lab, U Delaware, URL: www.pel.cis.udel.edu/
[21] R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, P. Conrad, “Stream Control
     Transmission Protocol (SCTP) Partial Reliability Extension,” RFC 3758,
     5/04
[22] R. Bickhart, “SCTP shim for legacy TCP applications,” MS Thesis,
     Protocol Engineering Lab, U Delaware, 8/05
[23] Blizzard Entertainment, Technical Support Site, URL:
     www.blizzard.com/support/
[24] L. Rizzo, “Dummynet: A simple approach to the evaluation of network
     protocols,” ACM CCR, 27(1), 1/97
[25] TCPDUMP Public Repository, www.tcpdump.org/
[26] PCWorld.com – Firefox Downloads Top 100 Million, URL:
     www.pcworld.com/news/article/0,aid,123140,00.asp
[27] D. Reed, email to end2end-interest mailing list, 10/02. URL:
     www.postel.org/pipermail/end2end-interest/2002-October/002434.html
[28] J. Gettys, email to end2end-interest mailing list, 10/02. URL:
     www.postel.org/pipermail/end2end-interest/2002-October/002436.html
[29] J. Gettys, H. Nielsen, “The WebMUX Protocol,” URL:
     www.w3.org/Protocols/MUX/WD-mux-980722.htm
[30] HTTP-NG working group (historic). URL: www.w3.org/Protocols/HTTP-NG/
