The Definitive Guide to Cloud Acceleration

Dan Sullivan
Copyright Statement

© 2013 Realtime Publishers. All rights reserved. This site contains materials that have been created, developed, or commissioned by, and published with the permission of, Realtime Publishers (the "Materials") and this site and any such Materials are protected by international copyright and trademark laws.

THE MATERIALS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. The Materials are subject to change without notice and do not represent a commitment on the part of Realtime Publishers or its web site sponsors. In no event shall Realtime Publishers or its web site sponsors be held liable for technical or editorial errors or omissions contained in the Materials, including without limitation, for any direct, indirect, incidental, special, exemplary or consequential damages whatsoever resulting from the use of any information contained in the Materials.

The Materials (including but not limited to the text, images, audio, and/or video) may not be copied, reproduced, republished, uploaded, posted, transmitted, or distributed in any way, in whole or in part, except that one copy may be downloaded for your personal, non-commercial use on a single computer. In connection with such use, you may not modify or obscure any copyright or other proprietary notice.

The Materials may contain trademarks, services marks and logos that are the property of third parties. You are not permitted to use these trademarks, services marks or logos without prior written consent of such third parties.

Realtime Publishers and the Realtime Publishers logo are registered in the US Patent & Trademark Office. All other product or service names are the property of their respective owners.

If you have any questions about these terms, or if you would like information about licensing materials from Realtime Publishers, please contact us via e-mail at [email protected].
Scalability

Scalability implies the ability to shift the amount of computing and storage as needed to meet current needs. For example, if a business experiences a spike in demand for one of its Web applications, the business might need to bring additional servers online to respond to all requests in an acceptable time. In a cloud, these additional servers are already physically present in a data center. A cloud operating system (OS) is typically in place to deploy virtual images to additional servers and reconfigure load balancers, if required, to include the additional servers in an application cluster (see Figure 1.1).
Figure 1.1: Clouds provide for rapid scalability.
Scalability implies the ability to rapidly downsize resources as well. In the given example, when the spike in traffic subsides, some of the servers would be released from the cluster and returned to the pool of cloud resources for other applications or customers to use as needed. Storage services are treated in an analogous way in cloud computing. As more storage is required, it is allocated from a shared pool of storage resources. When it is no longer needed, storage is returned to the pool for others to use.
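To make the scale-up/scale-down cycle concrete, the following minimal sketch shows the kind of decision loop a cloud OS or autoscaler might run. The `CloudClient` class, its methods, and the request thresholds are invented for illustration; they do not correspond to any particular provider's API.

```python
# Minimal autoscaling sketch. CloudClient and its methods are hypothetical
# stand-ins for a provider's real provisioning API.
class CloudClient:
    def __init__(self):
        self.servers = ["web-1", "web-2"]

    def add_server(self):
        name = f"web-{len(self.servers) + 1}"
        self.servers.append(name)          # deploy a virtual image, join the load balancer
        print(f"scaled up: {name}")

    def remove_server(self):
        if len(self.servers) > 1:
            name = self.servers.pop()      # drain and return the server to the pool
            print(f"scaled down: {name}")

def rebalance(client, avg_requests_per_server, scale_up_at=100, scale_down_at=20):
    """Add a server under heavy load; release one when the spike subsides."""
    if avg_requests_per_server > scale_up_at:
        client.add_server()
    elif avg_requests_per_server < scale_down_at:
        client.remove_server()

client = CloudClient()
rebalance(client, avg_requests_per_server=150)  # spike: bring a server online
rebalance(client, avg_requests_per_server=10)   # quiet: return it to the pool
```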
Self-Service

Prior to the advent of cloud computing, when an application administrator needed to add computers to an application cluster or upgrade a server, it meant submitting requests to systems administrators and possibly provisioning additional hardware. Cloud computing platforms provide end users with the ability to provision servers and storage as needed through a cloud administration interface (see Figure 1.2). Typically, these interfaces allow users to specify the server configurations, virtual images, and storage they need.
As clouds are virtualized computing resources, cloud providers can offer a wide range of machine configurations. For example, a small server might include 1 core, 2GB of memory, and 200GB of local storage, while a higher-end server might include 8 cores, 32GB of memory, and 1TB of local storage. Cloud users can choose the optimal configuration based on costs and requirements. CPU- and memory-intensive applications might require a large and more costly server, while another application could be more cost effectively run on a number of low-CPU/low-memory virtual machines.

Cloud providers also maintain a catalog of virtual images. These can include a variety of OSs and preconfigured applications. If business analysts frequently work with a set of ad hoc reporting, statistical analysis, and visualization tools, the cloud provider can deploy a virtual image with these applications installed and configured so that they are readily available when needed.
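As a rough illustration of what a self-service provisioning call looks like, the sketch below posts a machine request to a hypothetical REST endpoint. The URL, request fields, and image name are all invented for illustration and do not correspond to any particular provider's API.

```python
import json
import urllib.request

def provision_server(cores, memory_gb, storage_gb, image):
    """Request a virtual server from a hypothetical self-service API."""
    request_body = {
        "cores": cores,            # e.g., 1 for a small server, 8 for a high-end one
        "memory_gb": memory_gb,
        "storage_gb": storage_gb,
        "image": image,            # catalog entry, e.g., a preconfigured analytics stack
    }
    req = urllib.request.Request(
        "https://cloud.example.com/api/servers",   # invented endpoint
        data=json.dumps(request_body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# provision_server(1, 2, 200, "analyst-tools-image")  # small server from the catalog
```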
Figure 1.2: Self-service allows non-IT users to configure their own computing and storage resources.
Pay-for-Service Model

Another distinguishing feature of cloud computing is the pay-for-service model. Instead of buying dedicated hardware for an application, application managers now have the option of essentially renting resources when those resources are needed, and paying for only what is used. Servers are typically billed in hour or minute time increments. The per-unit-of-time charge will vary with the virtual machine configuration and can range from pennies to dollars per hour per machine. Storage is usually charged based on the amount of storage used and the length of time data is stored.
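A quick back-of-the-envelope sketch shows how this billing model works in practice; the hourly and per-GB rates below are made up for illustration.

```python
def monthly_cost(servers, hours_per_day, rate_per_hour, gb_stored, rate_per_gb_month):
    """Estimate a monthly bill under simple pay-for-use pricing (invented rates)."""
    compute = servers * hours_per_day * 30 * rate_per_hour
    storage = gb_stored * rate_per_gb_month
    return compute + storage

# Four small servers run 8 hours a day at $0.10/hour, plus 500GB stored
# at $0.05 per GB-month: 4 * 8 * 30 * 0.10 + 500 * 0.05 = $121.00
print(monthly_cost(4, 8, 0.10, 500, 0.05))
```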
Figure 1.3: Prior to virtualization, it was common practice to dedicate a physical server to a single application or task. Virtualization allows for multiple applications to run on a single server while still maintaining OS isolation.
Categorizing Clouds

Cloud computing services can be categorized according to who is granted access to the cloud and by the types of services offered by the cloud. By access model, there are three types of clouds:

• Public cloud
• Private cloud
• Hybrid cloud

Each of these deployment models has its benefits and drawbacks.
Public Clouds

Public clouds are essentially open to any user. Many cloud providers are well known in the IT industry and include Amazon, Microsoft, Google, IBM, HP, and Rackspace. One of the advantages of a public cloud is the low barrier to entry: virtually anyone with a credit card can set up an account and provision resources. Also, public cloud providers have the advantage of specializing in cloud services offerings. They realize economies of scale, can invest in specialists to design and maintain their infrastructure, and can raise the capital required to deploy substantial cloud services. Public cloud providers share a number of common characteristics.
Some businesses may not allow confidential or sensitive data to reside on servers or storage systems outside of corporate control due to concerns about data leaks and loss of confidentiality. However, data can be readily encrypted before it leaves corporate control. Depending on jurisdiction, businesses may be required to keep confidential and private information within the jurisdiction or within a partner jurisdiction with equivalent privacy protections. Although the benefits of public cloud computing are well understood, for some business cases, a private cloud may be a more appealing option.
Private Clouds

Private clouds are controlled by organizations behind their firewalls and limit access to the cloud to organization members or partners. Large businesses and governments can have the need for, and the resources to build and maintain, private clouds. Fortunately, businesses do not need to start from scratch to build a private cloud; IT vendors offer cloud computing packages that include the hardware and software required for a private cloud. The single most significant benefit of a private cloud is that the organization deploying it maintains full control over the infrastructure.

One option for private clouds is to locate your infrastructure in a third-party data center. This option affords some economies of scale and specialization of labor with regards to managing the physical infrastructure and redundant network services. The business still retains control over the computing and storage infrastructure, so many of the benefits of an on-premise private cloud remain in place.
Hybrid Clouds

A hybrid cloud, as the name implies, is a combination of private and public clouds. The model was motivated by the desire for the benefits of both private and public clouds. In a hybrid cloud, jobs and data that need to stay within the corporate network can run on the private cloud while other jobs and data can be shifted to a public cloud provider, as Figure 1.4 shows. This approach can reduce the demand for private cloud resources and therefore reduce the capital expenditure needed to establish a private cloud.

Maintaining a hybrid cloud introduces challenges not encountered with the other models. If the cloud OSs running in the private and public clouds are not compatible, you might find yourself maintaining two catalogs of virtual images as well as two access control systems. Accounting and billing might also require different systems and create additional work to integrate. Using the same cloud OS (for example, OpenStack) in both the public and private clouds can reduce integration challenges. Compatible cloud OSs, such as the Amazon AWS platform and Eucalyptus, are not the same but use common APIs that can reduce the challenges to implementing a hybrid cloud.
Figure 1.4: Hybrid clouds combine private and public clouds and allow for workloads to move between the two.
Public, private, and hybrid clouds can all be used to deploy services for the benefit of customers, partners, and employees. The choice of the most appropriate access model will vary according to security, compliance, performance, and cost constraints.
Infrastructure as a Service

IaaS clouds offer access to virtual servers, storage, and related services. Cloud users provision virtual servers and storage as needed, and manage all aspects of the infrastructure at the OS level and above (see Figure 1.5). This option gives users substantial control over the size of virtual servers used, the software installed, and the way storage systems are utilized. This model also imposes the most responsibility on the cloud users. For example, software engineers using a public cloud for development would need to select an appropriate-size machine, load a virtual image with an appropriate OS, install additional tools if needed, and configure persistent storage. IaaS solutions are good choices when you need to maximize control over the OS, applications, and storage options. Alternatively, if you need less control over the infrastructure, a PaaS cloud may be a suitable option.
Figure 1.5: Infrastructure as a Service provides primarily computing, storage, and networking services.
Platform as a Service

PaaS clouds provide access to application services while alleviating the need for device management (see Figure 1.6). For example, a developer might use a PaaS cloud to run a large number of tests on a new software release. The developer can choose the appropriate number of preconfigured servers and submit the job without needing to set up the servers themselves.

PaaS can also reduce the time required to set up and manage application stacks. Instead of setting up application and database servers, PaaS users can use the application and data management platforms provided by the PaaS cloud. Google App Engine, for example, allows software developers to run their Java or Python applications on Google infrastructure without the need to manage virtual machines. The Microsoft Windows Azure cloud includes a relational database service, Azure SQL, which a business can use instead of managing its own Microsoft SQL Server instance. The lines between IaaS and PaaS are sometimes blurred, as IaaS providers offer services, such as databases and messaging services, as part of their IaaS offerings.
Figure 1.6: Platform as a Service extends the IaaS level of services to include application stack services.
Software as a Service

The third category of cloud service type, SaaS, provides fully functional applications to end users. Applications as different as word processing and customer relationship management (CRM) are available from SaaS providers. A key advantage of the SaaS model is that users do not have to manage any part of the infrastructure. Some applications will require end users to configure access controls, program options, and other application settings, but the SaaS provider manages all aspects of the computing, storage, and network infrastructure, as Figure 1.7 illustrates.
Figure 1.7: Software as a Service provides turnkey applications that minimize the demands on end users to set up and configure the application.
SaaS has created opportunities for both SaaS consumers and SaaS providers. Users of SaaS services can reduce or eliminate the need to maintain specialized applications in-house or in a cloud. For example, an architecture firm using a SaaS for managing its financials can avoid having to run a financials package in-house and may be able to reduce the number of staff dedicated to supporting the financial package. SaaS providers have opportunities to create services that might not be efficiently implemented within a single organization. For example, a SaaS that provides HIPAA-compliant records management services could find a large market of small and midsize healthcare providers interested in its services. SaaS providers may implement their applications in public, private, or hybrid clouds.
Consider some statistics on Web site performance:

• 73% of mobile device users report encountering Web sites that were slow to load
• 47% of consumers expect Web pages to load in 2 seconds or less
• 40% abandon sites that take more than 3 seconds to load
• 79% of shoppers who are dissatisfied with a site's performance are less likely to buy from that site again

Clearly, the responsiveness of an application can have a direct impact on customer satisfaction, loyalty, and ultimately revenue.
Software-based Options

One way to improve performance is to tune application code, a task that can include a range of profiling and code optimization techniques.
Hardware Options

The cloud also allows businesses to implement a well-known but sometimes questionable practice of throwing more hardware at the problem. Rather than review and revise code, it might be faster to simply scale up the servers that are running the code. One could scale vertically by deploying the application to a server with more cores and memory and faster storage devices. Alternatively, applications that lend themselves to distributed workloads can scale horizontally. This action entails adding additional servers to a load-balanced cluster and allowing the load balancer to distribute the work among more servers. Both of these scenarios can help improve performance, assuming there are no bottlenecks outside the servers (for example, the time required to perform I/O operations on a storage array). If I/O performance is a problem, you might be able to improve performance by switching to faster storage technology.
Figure 1.8: Global data centers are essential for geographically distributing replicated content.
Businesses can deploy and maintain their own data centers or infrastructure within co-location facilities around the globe. Such a deployment would have to have sufficient global reach to respond to customers, employees, and business partners wherever they may be. These deployments would also have to include sufficient hardware to scale to meet the peak demands each data center would encounter.
Redundancy

Redundancy is another consideration. Hardware fails. Software crashes. Networks lose connectivity. If a data center were to fail, other data centers around the globe should be configured to respond to traffic normally handled by the failed site. Redundancy also entails maintaining up-to-date copies of content. Replication procedures should be in place to ensure that content is distributed to all data sites in a timely manner.
Figure 1.9: The rate of data exchange between ISPs will depend on multiple factors, including the topology of the network. Congestion at the links between ISPs can contribute to high latency in global Web applications.
Summary

Cloud computing is creating opportunities for businesses to expand their reach to a global scale. The cost and complexity of deploying computing and storage services are lowered with cloud computing. There is also greater flexibility to adapt to new business opportunities by leveraging IaaS and PaaS platforms to create new applications and services. The increasing adoption of SaaS platforms also presents an opportunity for businesses to offer their services in a SaaS model. Businesses must pay particular attention to Web application performance for all customers regardless of those customers' locations. Adding servers and storage will improve some but not all aspects of application responsiveness. Cloud acceleration techniques may be required to ensure consistent and acceptable levels of performance for all application users.
Chapter 2: How Websites and Web Applications Work

Retrieving a Web page involves several steps:

• Sending the domain to a domain name service (DNS) server to map the URL to an IP address
• Routing the Web page request, possibly over a series of different Internet providers, to the Web server at the IP address provided by the DNS server
• Retrieving or generating the content of the Web page from the Web server
• Packaging the content of the Web page into a series of TCP packets that deliver the content to the client device that is making the Web page request
• Reconstructing the Web page from packets on the client device
• Rendering the Web page for the user
Retrieving a Web page might sound like a simple operation, but there are clearly multiple steps involved and each of them can introduce delays into the process. If a DNS server is unreachable or slow to respond, the client device making the request might need to query a different DNS server. This requirement delays the start of the process to retrieve the Web page.
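The first step, name resolution, is easy to observe from code. The minimal sketch below resolves a host name to IP addresses using Python's standard library; the host name is a placeholder.

```python
import socket

def resolve(hostname):
    """Map a host name to its IP addresses: the first step in loading a page."""
    try:
        # getaddrinfo queries the resolver configured on the client device
        results = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
        return sorted({addr[4][0] for addr in results})
    except socket.gaierror as err:
        # An unreachable or failing DNS server surfaces here; the client
        # must retry or fall back to another resolver, delaying page load.
        print(f"resolution failed: {err}")
        return []

print(resolve("www.example.com"))
```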
Traffic that moves between Internet service providers (ISPs) might be subject to congestion when routing between services or to limits on the speed with which traffic from competing ISPs is handled. This potential bottleneck can delay the transmission of data between the client device and the Web server.
The Web server itself can be the cause of delays. If the Web server is subject to heavy loads, there might be a long queue of requests waiting to be processed. If Web page requests require a substantial number of I/O operations on the Web server, the response time will be longer than if those operations could be avoided.

The way the TCP protocol functions is even a potential cause of delays. TCP guarantees delivery of packets. To meet this guarantee, TCP requires more communication steps to ensure packets are delivered. If some packets are lost or delayed, they must be retransmitted. The need for retransmitting and the delays it can introduce might be minimal on local area networks (LANs), but moving data across global-scale networks will more likely entail some lost packets.
You must also consider the latency of long-distance networks. The physical limitations of network speed combined with network traffic and ISP policies all influence the time required to transmit data between a client device and a Web server. The speed of the last-mile connection to the client device and the configuration of the client device can also influence the speed with which Web content is rendered. The remainder of this chapter will delve into these potential problems in more detail.
Figure 2.1: Web pages routinely combine text, images, video, and sound.
Web pages constructed of multiple types of content are typically built from components stored in multiple files. For example, the Web page depicted in Figure 2.1 includes 24 images. Those images are each stored in separate files on a Web server. When the page is rendered on a client device, the client issues an HTTP GET command to download each image. Each GET command requires a connection to the Web server and I/O operations on the server to retrieve the image from persistent storage.
Note: Actually, the Web server might not have to retrieve the image from disk or other persistent storage if it is cached. I will discuss that scenario shortly.
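As a rough sketch of what the browser is doing, the following Python snippet issues one HTTP GET per page component using the standard library's HTTP client; the host and image paths are placeholders standing in for the page's image files.

```python
import http.client

HOST = "www.example.com"                     # placeholder host
IMAGES = ["/img/map1.png", "/img/map2.png"]  # placeholders for the page's images

def fetch_page_components():
    """Issue one HTTP GET per component, as a browser rendering the page would."""
    for path in IMAGES:
        conn = http.client.HTTPConnection(HOST, 80, timeout=10)
        conn.request("GET", path)            # a separate GET for each image
        resp = conn.getresponse()
        print(f"GET {path}: status {resp.status}, {len(resp.read())} bytes")
        conn.close()

fetch_page_components()
```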
Once the server has retrieved the content for each of the 24 images, the Web server transmits each image to the client device. Each transmission is subject to some of the potential problems described earlier in the chapter, including congestion and lost packets. These problems are especially problematic for Web pages that require all or most of the content on the page to be rendered before the page is useful.
Consider a Web site offering weather information. A page that allows a user to enter a location to check today's weather would be functional as soon as the search box and related text are rendered. Other images, such as ad displays, could load while the user enters text into the search box; there is little or no adverse effect on the user. If, however, a user loads a page with multiple maps and satellite images, the user might have to wait for multiple images to load to find the specific information they are seeking. When a Web page has multiple objects that require separate connections, the number of connections required determines the time required to fully load a page.
The speed of a download is determined by three factors:

• Bandwidth
• Latency
• Packet loss

Each of these factors should be taken into account when analyzing Web page load times (see Figure 2.2).
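A simple model makes the interaction of these three factors visible. The sketch below estimates transfer time as latency plus size divided by bandwidth, with packet loss crudely modeled as retransmitted bytes; the numbers are illustrative assumptions, not measurements.

```python
def estimated_download_seconds(size_bytes, bandwidth_bps, rtt_seconds, loss_rate):
    """Rough page-component download estimate (illustrative model only)."""
    effective_bytes = size_bytes * (1 + loss_rate)    # lost packets are resent
    transfer = (effective_bytes * 8) / bandwidth_bps  # serialization time
    return rtt_seconds + transfer                     # one round-trip to start, then data

# 500KB of images over 5Mbps with 80ms round-trip time and 2% packet loss:
print(estimated_download_seconds(500_000, 5_000_000, 0.080, 0.02))  # ~0.90s
```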
Figure 2.2: Loading a single Web page can require multiple files and therefore multiple connections between a client and a server (screenshot of Wireshark Version 1.8.5).
Network traffic can adversely affect latency if the volume of incoming traffic is more than network devices can process at the time. When buffers are full, a device may send a single packet to the source system to stop transmitting data. This situation can increase the round-trip time (RTT) on that connection.
The greater the physical distance a data packet must travel, the longer it will take. In addition, longer distances usually entail additional network devices, some of which could be significantly slower than others. This setup can introduce bottlenecks in the network that limit the overall speed to the speed of the slowest segment, as Figure 2.3 illustrates.
Figure 2.3: Bottlenecks in the network can slow data transmission to the speed of the slowest segment of the network.
Changes in TCP protocol configurations can affect the performance of a network. For example, high-bandwidth networks might not be taking full advantage of the bandwidth available if devices on the network have TCP protocols configured to send smaller amounts and then wait for an acknowledgment before sending more data.
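One concrete, portable knob of this kind is the socket send/receive buffer size, which bounds how much unacknowledged data can be in flight. The sketch below shows how an application can inspect and raise these buffers in Python; whether larger buffers help depends on the network, and the 1MB value is only an illustrative assumption.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Default buffer sizes chosen by the OS
print("default send buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("default recv buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))

# Request 1MB buffers so more data can be in flight before an ACK arrives
# (illustrative value; the OS may round or cap it)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1 << 20)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)

print("tuned send buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
sock.close()
```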
Packet loss is another problem sometimes related to device configuration (see Figure 2.4). Devices that receive data from a network connection have to buffer data so that the data can be processed by an application. If the buffers become full while incoming data continues to arrive, then packets can be lost. The TCP protocol is designed to guarantee delivery of packets. Devices that send packets to a receiving device expect an acknowledgement that packets have been received. Sending devices are configured to wait a certain period of time, and if no acknowledgment is received in that time, the sender retransmits the unacknowledged packets.
Figure 2.4: When applications cannot read from the connection's data buffer fast enough to keep up with incoming traffic, then packets can be lost.
Data loss can occur for similar reasons when there is congestion on network devices between the source and destination devices.
Popularity of Content

In addition to network performance problems, the speed at which you can deliver content is also influenced by the speed of Web servers. When large numbers of users are requesting content from a Web server, there might be contention and congestion for resources. For example, if every request for content results in multiple I/O operations to retrieve content from disk, the performance level of the disk will be a limiting factor in the responsiveness of your Web site. Retrieving content from disk is slower than retrieving content from RAM. One way to improve response time is to store a copy of content in RAM. This method is one type of caching technique that can improve Web site performance.
Before delving into the different types of caching, it is important to note that the benefits from all forms of caching are a function of the popularity of content. The reason is that when a piece of content is retrieved from persistent storage, a copy is saved in the cache. The next time that content is requested, it can be retrieved from the cache more quickly than it could be retrieved from disk.
Very low-popularity content is far less likely to be cached than is popular content. As a result, popular content can be delivered faster than unpopular content can. This reality is certainly better than consistently slow page load times, but it can lead to inconsistent performance in a user's experience with your Web site. There are three common types of caching:

• Browser caching
• Web server caching
• Proxy caching

Each of these can improve Web site performance with static content, but they work in different ways and have different advantages and limitations.
Browser Caching

Web browsers can store local copies of content on a client device. The first time content is downloaded, it can be stored in the cache. The next time it is needed, the content is loaded from cache instead of downloading from the Web site (see Figure 2.5). This setup saves the RTT to retrieve the content. Browser caching helps with content that is used repeatedly during a session. Logo images and CSS files that may be used throughout a site need to be downloaded only once during a session.
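Servers opt content into browser caching through HTTP response headers. The sketch below is a minimal Python handler that serves a logo image with a Cache-Control lifetime; the one-day max-age and the file name are illustrative choices, not recommendations.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class CachingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/logo.png":               # illustrative static asset
            body = open("logo.png", "rb").read()
            self.send_response(200)
            self.send_header("Content-Type", "image/png")
            # Tell browsers they may reuse this file for a day without
            # re-downloading it, saving an RTT on every later page view.
            self.send_header("Cache-Control", "max-age=86400")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

# HTTPServer(("", 8000), CachingHandler).serve_forever()
```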
Figure 2.5: Browser cache can improve page loading times on individual devices, especially for content reused within a Web site.
Web Server Caching

Web servers can also keep copies of content in memory on the server itself. Unlike browser caching, this form of caching does not eliminate the need for round-trip communication between the client device and the Web server. It does, however, have the additional benefit of caching content accessed by other users, which is not the case with browser-based caching.
Figure 2.6: Web server caching pools the benefits of caching across multiple users. Note: Green ovals represent pages retrieved; white ovals are pages stored but not retrieved.
Proxy Caching

Proxy caching is a service offered by ISPs to reduce the time needed to load content on client devices and to reduce traffic on their own networks. As Figure 2.7 shows, proxy caching works by keeping copies of popular content on the ISP's servers. When a client device makes a request for content, the proxy cache is queried and, if the content is found, it is delivered to the client device from the ISP. This setup avoids the overhead and delay encountered when the request has to be sent to the Web site's server.
Proxy caching reduces RTT because the client needs only to wait for a response from the ISP. This type of caching has advantages for the ISP's customers. It cuts down on the load on the customer's Web site and reduces the traffic on the customer's networks. Customers have limited control over the proxy cache, however, because the ISP is likely using the cache for multiple customers.
Figure 2.7: Proxy caching reduces RTT by shortening the distance network traffic must traverse to respond to a request.
Another limitation on caching is the amount of cache available. RAM is a limited resource on any server, and only so much can be dedicated to caching Web content. When the cache is full, either no other content can be stored there or some of the content must be deleted to make room for newer content. This requirement can be handled in several ways (see Figure 2.8).
A simple strategy is to remove the oldest content whenever the cache is full. Old content, however, might be popular content, so a better approach in some cases is to delete the least recently used content without respect to the age of the content.
Figure 2.8: Cache content replacement policies optimize for different measures, such as frequency of use or object size.
Another strategy considers content size rather than just age or recent usage. The idea is that if more small objects are kept in the cache, more objects can be stored. In turn, this setup should increase the rate at which objects are found in the cache. Size, age, and frequency of access policies can be combined to optimize different objectives with caching.
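A least-recently-used policy is compact enough to sketch directly. The Python example below uses an OrderedDict to evict the least recently used object when a deliberately tiny capacity is exceeded; the capacity and keys are illustrative.

```python
from collections import OrderedDict

class LRUCache:
    """Toy least-recently-used cache; capacity is an illustrative limit."""
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                       # cache miss: fetch from disk instead
        self.items.move_to_end(key)           # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict the least recently used object

cache = LRUCache()
for page in ["home", "map", "radar", "home", "forecast"]:
    cache.put(page, f"<html>{page}</html>")
print(list(cache.items))   # 'map' was evicted; 'home' survived because it was reused
```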
The age of objects in the cache is another way to determine when an object should be deleted from the cache. A parameter known as time to live (TTL) determines how long an object is stored in the cache before it is deleted. A cache with a 1200-second TTL, for example, would keep objects in the cache for at most 20 minutes before deleting them.
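In code, a TTL policy is simply a timestamp comparison at lookup time, as in this minimal sketch (the 1200-second TTL mirrors the example above):

```python
import time

TTL_SECONDS = 1200                 # 20 minutes, matching the example above
cache = {}                         # key -> (value, time stored)

def get_with_ttl(key):
    entry = cache.get(key)
    if entry is None:
        return None                # never cached
    value, stored_at = entry
    if time.time() - stored_at > TTL_SECONDS:
        del cache[key]             # expired: delete and force a fresh fetch
        return None
    return value

cache["home"] = ("<html>home</html>", time.time())
print(get_with_ttl("home"))        # still fresh: returned from the cache
```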
An alternative to caching is replication, in which copies of content are stored persistently on servers distributed across the Internet. The goal is to keep copies of content closer to users to reduce the RTT needed to satisfy requests for content. It also has the added benefit of reducing load on the Web server hosting the original source of the content. Key considerations with replication are the number and location of replicated servers and the frequency with which content is updated. If a large portion of your Web traffic originates in Europe, it makes sense to replicate content to servers located there.
Other factors should be considered as well. If you located a replicated server in a data center in Amsterdam, for example, would customers and business partners in Eastern Europe realize the same performance improvements as those in Western Europe? If latency and congestion are problems with some ISPs in Eastern Europe, you might want to deploy a replicated server in such a way as to minimize the traffic over those ISPs.
Replicated content will need to be refreshed to keep all content consistent across servers. Frequent, incremental updates are warranted when content changes often. The type of content you are replicating can also influence your decisions about update frequency. If content contains legal or regulated information, such as forward-looking statements from a publicly traded company, it is especially important to minimize the chance that users in different parts of the world would see different versions of the content (see Figure 2.9).
Figure 2.9: Origin of traffic and network conditions influence the placement and number of replicated servers.
Caching and replication can help with static content, but many Web pages are dynamically generated. For dynamic content, you need to consider other methods to optimize delivery. Before considering ways to optimize dynamic content, let's explore the steps involved in generating and delivering that content, taking a purchase transaction as an example.
When a customer views a catalog of products, the content may be served from a cache or replicated site. Once the customer selects a set of products for purchase, they will begin viewing dynamically generated content. Each request must be served from the origin. As Figure 2.10 illustrates, a purchase transaction requires multiple applications working together.
Figure 2.10: Multiple applications can contribute to the construction of a single dynamically generated Web page.
Distance to Origin

The distance to origin is another factor that can significantly impact the speed with which a Web application can respond to a user. The greater the distance between a client and a server, the longer the time required to transmit data. Such is the case regardless of other network conditions, such as the rate of packet loss or congestion on the network.
Transmitting data on global scales on the Internet requires data to move over networks managed by multiple ISPs. Those ISPs may have varying policies on handling traffic traversing their networks and service level agreements (SLAs) with their own customers that result in lower priority for non-customer traffic.
When there is congestion on a network, there is a risk of losing packets. The TCP protocol expects acknowledgements for packets sent and waits a specified period of time for that acknowledgement. If an acknowledgement is not received, the packet is resent. This setup introduces a two-fold delay in the Web application: the time the TCP client waits for the acknowledgement, and the time required to send the packet again.
Understanding the distance between client devices and the system of origin is a key element in assessing the need for dynamic content optimization. When the combination of the number of requests for dynamic content, the importance of the dynamic content, and the distance to origin warrants optimized dynamic content, then it is time to consider how to reduce the request burden on the system of origin.
There are multiple ways to improve the performance of the system of origin:

• Web server caching can be used to reduce disk I/O operations related to the static content portion of the application
• Databases can be tuned to improve query response time and minimize the number of disk read and write operations
• Application code can be analyzed to identify time-consuming operations
• An application cluster can be expanded to distribute the workload over a larger number of servers

These, however, will not directly address problems with delivering dynamic content to distant client devices. Additional performance tuning options, such as the TCP optimizations described next, are needed for that.
TCP Optimizations

TCP optimizations are techniques used to improve the overall performance of the TCP protocol. The protocol was developed early in the history of internetworking and was designed to provide reliable transmission over potentially unreliable networks. Today's networks can realize much faster transmission speeds than those available when TCP was first designed. A number of different kinds of optimizations are available.
The selective acknowledgment optimization can help reduce the amount of data retransmitted when packets are lost. TCP was originally designed to use something called a cumulative acknowledgement, which provided limited information about lost packets. This functionality can leave the sender waiting additional RTT periods to find out about additional lost packets. Some senders might operate under the assumption that additional packets were lost and retransmit other packets in the segment. Doing so can lead to unnecessary retransmission of packets. Selective acknowledgement provides more information about lost packets and helps reduce unnecessary retransmission of data.
Web sites may also need to deliver:

• Device-specific content
• Browser-specific content
• Geography-specific content

Each of these factors can lead to larger volumes of static content and require more complex applications for generating dynamic content. Businesses should consider the need for device-, browser-, and geography-specific content as they assess their current Web applications and need for acceleration.
Summary

Maintaining consistent Web application performance for all users is a challenge. Various forms of caching can improve performance in some cases, but other cases are better served by replicating content to servers closer to end users. Dynamic content requires other optimization techniques to improve overall TCP performance, reduce packet loss, and cut total RTT.
The data that needs to be sent from the customer in Singapore to the data center in the United States is packaged into Transmission Control Protocol (TCP) packets. TCP is one of many Internet protocols, but it is one of the most important from an application perspective. TCP guarantees delivery of data packets in the same order in which they were sent. The importance of this feature is obvious. Imagine if the data entered for a transaction sent halfway across the globe was broken into pieces, transmitted, and arrived at the destination in the wrong order. When the packets were reassembled at the destination, you might find that the order quantity was swapped with the customer's shipping address. TCP prevents this kind of problem by implementing controls that ensure packets are properly ordered when they are reassembled at their destination.

Other protocols are required as well to get data from one network device to another. Data that is transmitted over long distances can move across multiple paths. For example, data arriving from the west coast of the United States could be sent to an east coast data center over paths across the northern or southern United States, or could be sent across a less straightforward route (at least from a geographical perspective). Routing protocols are used to determine how to send packets of data from one point in the Internet to the next.
Figure 3.1: The most efficient route, from a technical perspective, may not be taken. Organizational and business relationships between network providers influence the path taken between source and destination devices.
Some network providers have created large advanced networks with substantial geographic reach. These network providers can make arrangements with each other to allow the transit of each other's traffic free of charge. Sometimes the cross-network routing arrangement is not equally beneficial. For example, a small network provider may have a large portion of its traffic sent outside the network while a large provider has only a small amount of data routed to smaller providers. This kind of asymmetry undermines the logic for free exchange of traffic.
When there is significant asymmetry between providers, the smaller provider will likely have to pay some kind of fee, known as a transit charge, to the larger provider. Obviously, one of the ways these smaller providers control cost is by limiting as much as possible the amount they have to pay in transit fees.

Let's go back to the example with network providers A, B, C, and D. The most efficient route from a device on Network A to a device on Network D is through Network B. Network B charges 50% more in transit charges than Network C. Network A is a midsize network provider and cannot secure peering agreements without paying fees to some of the larger network providers. From a technical perspective, Network A could route its traffic to Network D over Network B; from a business perspective, that option is cost prohibitive. As a result, a customer on Network A sending and receiving traffic from Network D depends on sub-optimal routing.
Measuring Performance

If your customers are complaining about poor performance or abandoning shopping carts at high rates, application performance might be a problem. Many factors influence the overall speed at which your application processes transactions.

Like software development, analyzing network problems is partly a matter of dividing and conquering. The goal is to find the bottleneck (or bottlenecks) on the network between your servers and client devices. Is the problem at a router in your data center? Is your ISP meeting agreed-upon service levels with regard to your network traffic? Does the problem lie with networking infrastructure closer to the client devices?
To find out where network problems occur, it helps to have standard measures that allow you to quantify network performance. Customers or other end users might describe problems with application performance in qualitative terms such as "slow" and "takes too long." Sometimes they might even experience errors and can provide qualitative error messages, such as "the connection timed out." These kinds of descriptions are symptoms of an underlying problem, but they lack specificity and precision.
This discussion will focus on three quantitative measures when considering network performance:

• Throughput
• Latency
• Packet loss

These measures describe characteristics of networks that can influence the overall performance of your applications.
Throughput

Throughput is a measure of the data packets that are transmitted over a network. It is usually described in bits per second. We often describe network throughput in terms of a single quantity, such as 100Mbits per second. In practice, network throughput can vary over time with changes in network conditions.
A number of factors can influence throughput. Physical characteristics such as the noise on the line can interfere with signals and reduce throughput. The speed at which network equipment can process data packets can also impact throughput. Internet Protocol (IP) packets, for example, are comprised of header data and payloads. Network devices use data in the header to determine how to process a packet. Throughput of the network will depend, to some degree, on how fast network devices can process this header data.
We can draw an analogy between network throughput and traffic on a highway. A single-lane highway may allow only one car at a time at any point on the highway, a two-lane highway can have two cars at a time, and so on. By adding more lanes, you can increase the number of cars that can travel over the highway. When cars travel at the top speed of the highway and there are cars in every lane, you reach the peak throughput of the highway. Similarly, with networks, you reach peak throughput when you fully utilize the network's carrying capacity.

Peak throughput cannot always be maintained, though. In the highway analogy, a car accident or slow-moving car can reduce the speed of other drivers. In the case of the network, signal degradation or delays processing data packets can reduce throughput below peak throughput.
As this discussion is primarily concerned with Web application performance, it is helpful to work with average or sustained throughput over a period of time rather than peak throughput.
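Sustained throughput is straightforward to estimate from code: download a known payload and divide the bits transferred by the elapsed time. The sketch below does this with Python's standard library against a placeholder URL; a larger file gives a steadier estimate.

```python
import time
import urllib.request

def sustained_throughput_bps(url):
    """Estimate sustained throughput by timing a full download."""
    start = time.monotonic()
    with urllib.request.urlopen(url) as resp:
        total_bytes = len(resp.read())
    elapsed = time.monotonic() - start
    return (total_bytes * 8) / elapsed     # bits per second over the whole transfer

print(sustained_throughput_bps("http://www.example.com/"))  # placeholder URL
```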
Latency

In addition to throughput, which measures the amount of data you can transmit over a particular period of time, you should consider the time required to send a packet from a source system to a destination system. Latency is a measure of the amount of time required to send a packet to another device and back to the sending device. Latency can also be measured in terms of a one-way trip from a source to the destination, but, for this discussion, latency will refer to round-trip latency unless otherwise noted.
Returning to the highway driving analogy, you can think of latency as the time required to make a round-trip from one point on the highway to another point and then return to the starting point. Obviously, the speed at which the car moves will determine the time required for the round-trip, but other factors can influence latency. If there is congestion on the network, as with highways, latency can increase. If a car needs to leave fast-moving traffic on a highway to transit a road with slower maximum speeds, the latency will increase over what would have been the latency had the car stayed on the highway. This situation is analogous to a data packet that is transmitted over a high-speed network but then routed over a slower network en route to its destination device.
Figure 3.2: Ping can be used to measure latency between two devices. This example highlights the latency of packets sent from the west coast of the United States (Oregon) to a server on the east coast (Delaware), averaging 87.1ms and ranging from 79.68ms to 129.24ms.
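Where ping is unavailable (it typically requires raw-socket privileges), timing a TCP connection from code gives a serviceable RTT estimate, since the connect call completes one handshake round-trip. A minimal sketch, with a placeholder host:

```python
import socket
import time

def tcp_rtt_ms(host, port=80):
    """Approximate RTT by timing a TCP connect (one handshake round-trip)."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=5):
        pass                                # connection established, then closed
    return (time.monotonic() - start) * 1000

samples = [tcp_rtt_ms("www.example.com") for _ in range(5)]
print(f"avg {sum(samples) / len(samples):.1f}ms, "
      f"min {min(samples):.1f}ms, max {max(samples):.1f}ms")
```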
Just to clarify, latency is the time needed to send a packet of data on a round-trip between two networked devices. This idea should not be confused with the time required to complete a transaction on a Web application. Network latency is one factor in the time required to complete a Web transaction; the time required to complete a transaction also includes the time needed by database servers, application servers, and other components in the application stack. In addition to throughput and latency, you should also consider the rate of packet loss on a network.
Packet Loss

Packet loss occurs when a device sends a packet that is never received by the target device. This loss is easily detected by TCP. Once a TCP connection is established between two devices, the devices allocate memory as a buffer to receive and hold data packets. The buffer is needed because packets may arrive out of order. For example, packet 1 may be routed over a network that suddenly experiences a spike in congestion while packet 2 is routed over another network. When packet 2 arrives at the destination, it is held in the buffer until packet 1 arrives. (There are other conditions that determine how the buffer is used and when it is cleared, but those conditions can be safely ignored in this illustration.) If packet 1 does not arrive within a reasonable period of time, the packet is considered lost.

Lost packets have to be retransmitted, which consumes additional network bandwidth. In addition to transmitting the packet twice, there is additional traffic between the two devices to coordinate the retransmission. The receiving device also waits the period of time specified in the TCP configuration before determining a packet has been lost. All of these factors combine to increase the time needed to send a message from the source to the destination. Packet loss is one of the factors that decreases throughput.
Packet loss can be caused by signal degradation if the signal must travel long distances or is subject to physical interference. Congestion on a network device can also cause packet loss. Routers, for example, have only so much memory to buffer data arriving at the router. When the traffic arriving at the device exceeds the capacity of the device to buffer and process the data, packets may be lost.
In some cases, packet loss does not cause retransmission of packets. Some applications can make use of the User Datagram Protocol (UDP), which, unlike TCP, does not guarantee delivery or delivery in order. This setup obviously will not meet the needs of transaction processing systems, but there are use cases where UDP is appropriate.
When transmitted data has a limited useful life and is replaced with newly generated data, then UDP is appropriate. For example, if you were to synchronize clocks on multiple servers by sending a timestamp from a time server, you could lose a single timestamp without adversely affecting the other systems. A new datagram with a new timestamp would be generated soon after the first datagram was lost. Transmitting video or audio over UDP makes sense as well. If many packets are lost, the quality of the image or sound may degrade, but the benefits of lower overhead compared with TCP make UDP a viable alternative.
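The clock-synchronization example above is only a few lines of UDP in Python. The sketch below sends timestamp datagrams with no delivery guarantee; the port number is an arbitrary illustrative choice, and the address points at the local machine so the script runs anywhere.

```python
import socket
import time

PORT = 9999                                  # arbitrary illustrative port

def send_timestamp(sock, address):
    """Fire-and-forget: if this datagram is lost, the next one replaces it."""
    payload = str(time.time()).encode("ascii")
    sock.sendto(payload, address)            # no handshake, no acknowledgment

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for _ in range(3):
    send_timestamp(sock, ("127.0.0.1", PORT))
    time.sleep(1)                            # a fresh timestamp every second
sock.close()
```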
From this discussion, you can see that there are at least three measures to consider when analyzing the impact of the network on Web applications. Throughput is a measure of the number of packets, and therefore the number of successfully delivered bits, sent between devices. Latency is a measure of the time required for a round-trip transmission between devices, while packet loss is a measure of the amount of data that is lost on a line. Applications that use TCP (most transactional business applications) not only have to retransmit lost packets but also incur additional overhead to coordinate the retransmission.
Next, let's consider how protocols, especially Hypertext Transfer Protocol (HTTP) and TCP, can contribute to network bottleneck issues.
Protocol Issues

The Internet utilizes a substantial number of protocols to implement the many services we all use. Some of these operate largely behind the scenes without demanding much attention from the software developers who create Web applications. The Border Gateway Protocol (BGP), for instance, is needed to route traffic between independent domains and to exchange information about network routes. The Domain Name Service (DNS) is probably more widely recognized as the service that maps human-readable domain names (for example, www.example.com) to IP addresses.

As you consider Web application performance issues, you should consider how these services can adversely impact performance. DNS, for example, has been a significant contributor to adverse performance. However, it also helps to understand some of the implementation details around HTTP and TCP. The way you use HTTP and configure TCP can have a noticeable impact on Web application performance.
Figure 3.3: The original Web browser running on the NeXT computer at CERN, the birthplace of the Web (Source: http://info.cern.ch/).
In many ways, the fundamentals of transmitting Web content have not changed much from the early days of HTTP adoption. The way we use HTTP and generate content has certainly changed with new development technologies, advances in HTML, and the ability to deploy functional applications in a browser. The fact that we can use the same protocol (with some modifications over the years) to implement today's applications attests to the utility of HTTP. There are, unfortunately, limitations of HTTP that can still create performance problems for developers and Web designers.
Figure 3.4: Web pages and applications include multiple components, ranging from text and images to application services. Each requires a connection to download the component from its server to the client device.
The fact that multiple connections are needed to download all the components of a Web page helps to draw attention to the process of establishing a connection and the overhead that entails.
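One way to see that overhead, and a common mitigation, is connection reuse. In the sketch below, the first helper opens a fresh TCP connection (and pays a handshake) for every component, while the second reuses one connection for all of them; the host and paths are placeholders.

```python
import http.client

PATHS = ["/", "/style.css", "/logo.png"]     # placeholder page components

def fetch_new_connection_each(host):
    """One TCP handshake per component: the overhead described above."""
    for path in PATHS:
        conn = http.client.HTTPConnection(host, 80, timeout=10)
        conn.request("GET", path)
        conn.getresponse().read()
        conn.close()                         # connection torn down every time

def fetch_reusing_connection(host):
    """One handshake total: HTTP keep-alive amortizes the connection cost."""
    conn = http.client.HTTPConnection(host, 80, timeout=10)
    for path in PATHS:
        conn.request("GET", path)
        conn.getresponse().read()            # body must be read before reuse
    conn.close()

fetch_reusing_connection("www.example.com")
```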
TCP Handshake

Before devices can exchange packets using TCP, the two devices must first establish a connection. This arrangement is essentially an initial agreement to exchange data and an acknowledgement that both devices are reachable. This exchange is known as the TCP handshake.
Figure 3.5: The TCP handshake protocol requires an initial exchange of synchronizing information before data is transmitted.
The TCP handshake is a three-step process that requires the exchange of packets to acknowledge different phases of the process, known as:

• SYN
• SYN-ACK
• ACK

Each of these phases must occur before the next phase in the list can occur; if all phases complete successfully, a transfer of data can begin.
The TCP handshake begins with one device initiating a connection to another device. For example, when a customer uses a browser to download a Web page from a business Web site, a connection is created to download the initial HTML for the page. (There will probably be other connections as well, but this discussion will describe the process for a single connection.)

The first step of the process entails the initiating device sending a SYN packet to the listening device; for example, a laptop sending a SYN message to a Web server to download a Web page. Part of the SYN packet contains a random number representing a sequence number, which is needed in later steps of the process. After the SYN message is sent, the initiator is said to be in the SYN-SENT state, and it waits for a reply. Prior to receiving the SYN packet, the listening device is said to be in the LISTEN state.

Once the listening device (for example, the server) receives the SYN packet, it responds by replying with a SYN-ACK packet. This packet includes the sequence number sent in the SYN packet incremented by 1. The listening device generates another random number that is the second sequence number, which is included in the SYN-ACK packet. The listening device is then in the SYN-RECEIVED state.

The third and final step occurs when the initiating device responds to the SYN-ACK packet with a final ACK packet. This packet includes the first sequence number incremented by 1 as well as the second sequence number incremented by 1. After this point, both the initiator and the listening device are in the ESTABLISHED state and data transfer can begin. The connection can persist as long as needed. When the connection is no longer needed, a similar 3-step termination process, with FIN, FIN-ACK, and ACK packet exchanges, closes down the connection.
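To make the handshake concrete, here is a minimal Python sketch (the host name is only an example): the operating system's TCP stack performs the SYN, SYN-ACK, ACK exchange inside the connect() call, and the FIN-based termination when the socket is closed.

```python
import socket

# The OS performs the SYN / SYN-ACK / ACK handshake inside connect(),
# before any application data is sent.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("www.example.com", 80))   # handshake completes here

# Both ends are now in the ESTABLISHED state; send an HTTP request.
sock.sendall(b"HEAD / HTTP/1.1\r\nHost: www.example.com\r\n\r\n")
print(sock.recv(200))

# close() triggers the FIN-based termination exchange.
sock.close()
```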
Figure 3.6: TCP packets transmitted in the same connection may use different paths to reach the destination device. Sequence numbers in each packet are used to reconstruct the data stream in the same order it was sent.
Consider a situation in which a device receives a series of packets with numbers 100, 101, 102, and 104. Packet 103 has not been received. This situation might simply be a case of the situation depicted in Figure 3.6. Some of the packets used a path that allowed for faster delivery than did packet 103. In that case, it might be just a matter of a small amount of time before packet 103 arrives. When all the packets in a range of packets are received, the receiving device transmits an acknowledgment to the sender. Senders can use these acknowledgments to determine whether packets are lost.
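The following sketch mirrors the receiver's side of that 100-through-104 example. It is only an illustration of the idea; real TCP stacks buffer and reorder segments in the kernel.

```python
# Receive-side reorder buffer keyed by sequence number (illustrative only).
buffer = {}
next_expected = 100

def receive(seq, data):
    """Store a segment, then deliver any contiguous run of data in order."""
    global next_expected
    buffer[seq] = data
    delivered = []
    while next_expected in buffer:          # release in-order data only
        delivered.append(buffer.pop(next_expected))
        next_expected += 1
    return delivered

print(receive(100, "a"))   # ['a']
print(receive(101, "b"))   # ['b']
print(receive(104, "e"))   # []  -- held back; 102 and 103 still missing
print(receive(102, "c"))   # ['c']
print(receive(103, "d"))   # ['d', 'e'] -- gap filled, buffered data released
```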
TCP requires that you decide how long you are willing to wait for a packet to arrive. When packets are traveling over high-latency networks, such as satellite networks, you might want to have long waiting periods. There would be no advantage to asking the sending device to retransmit a packet that has not had sufficient time to reach the destination device. In fact, that could create additional, unnecessary traffic that could consume bandwidth and increase congestion on the network.
At the same time, you do not want to wait too long for packets to arrive. If packets are dropped, they will never arrive. For example, a packet may have been dropped at a router that was experiencing congestion and unable to process all the traffic arriving at the router. In other cases, the transmission signal may have degraded to the point that the packet was corrupted. When packets are actually dropped, it is better to infer the need to retransmit sooner rather than later. The destination device will buffer incoming data waiting for missing packets to arrive. Retransmitting requires sending messages from the destination to the sender and then from the sender to the destination device, so the wait for the retransmitted packet will be the time of a round-trip between the two devices.
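TCP stacks typically derive this waiting period, the retransmission timeout, from smoothed round-trip-time measurements. The classic estimator (not spelled out in this guide; the smoothing constants below are the conventional values) can be sketched as:

```python
# Sketch of the classic smoothed-RTT retransmission timeout estimator.
# Real stacks also clamp the result between minimum and maximum bounds.
srtt = None    # smoothed round-trip time, in seconds
rttvar = None  # round-trip time variation

def update_rto(sample, alpha=0.125, beta=0.25):
    global srtt, rttvar
    if srtt is None:                        # first measurement
        srtt, rttvar = sample, sample / 2
    else:
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
        srtt = (1 - alpha) * srtt + alpha * sample
    return srtt + 4 * rttvar                # retransmission timeout

for rtt in (0.040, 0.042, 0.120, 0.041):    # one congested sample among them
    print(f"RTT sample {rtt * 1000:.0f} ms -> RTO {update_rto(rtt) * 1000:.0f} ms")
```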
This discussion has delved into design and configuration details of TCP, but it is important to remember that this discussion is motivated by Web application performance. TCP design issues are important and interesting, but unless you are a network designer or a computer scientist studying protocols, the primary reason for reviewing TCP is to help improve the performance of your Web applications. Let's summarize how Web application performance is related to TCP. Here is a set of events triggered by a Web application user requesting a Web page:
The Web application generates Web pages that are downloaded to client devices.
Web pages are composed of multiple elements, including layout code, scripts, images, audio files, etc.
Each component that is downloaded separately requires a TCP connection.
TCP connections are initiated with a three-step TCP handshake before data transmission can begin.
Once data transmission begins, the connection remains open until a termination process is complete.
While the connection between two devices is open, TCP packets are transmitted between the devices.
TCP buffers incoming packets as needed to ensure a data stream is reconstructed in the order in which it was sent.
Listening devices wait for packets to arrive and then send acknowledgments to the sending device.
If the sending device does not receive an acknowledgement for the delivery of packets within a predefined time period, the missing packets are retransmitted.
You can see from this list of events that a single Web page can lead to the need for multiple connections, that multiple connections have to manage multiple packets, and that packets may be routed over slow, noisy, or congested networks. Packets routed over those poor-quality networks can be lost, requiring retransmission. Therefore, to improve the performance of your Web application, you might need to improve the performance of TCP transmissions.
Consider the latency of an application hosted in North America and serving a global customer base. Within North America, customers might experience around a 40ms round-trip time. The same type of user in Europe might experience a 75ms round-trip time, while customers in East Asia might experience a 120ms round-trip time. If that same application were replicated in Europe and Asia, customers in those regions could expect round-trip latency to drop significantly.
Just hosting an application in different regions will not necessarily improve application performance. Geographically close locations can still contend with high latencies and packet loss rates if the quality of the network between the locations is poor.
There are several characteristics of TCP connections that can be configured to improve performance, including:
Adjusting the size of buffers receiving data: A 64K buffer may be sufficient for some applications, but long round-trip times may warrant larger buffers (see the sketch after this list).
Adjusting parameters that determine when to retransmit a packet: TCP has a parameter that determines how much data a destination device will receive before it needs to acknowledge those packets. If this parameter is set too low, you might find that you are not using all the bandwidth available to you because the TCP clients are waiting for an acknowledgement before sending more data.
Using selective acknowledgements: TCP was originally designed to use a cumulative packet loss acknowledgement scheme, which is not efficient on networks with high packet loss rates. It can limit a sender to receiving information about a single packet in the span of one round-trip time; in some cases, the sender may retransmit more than necessary rather than wait for a series of round-trip time periods to determine whether all packets were received. An alternative method, known as selective acknowledgments, allows a receiving device to acknowledge all received packets so that the sender can determine which packets need to be resent.
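As a minimal sketch of the first adjustment, assuming a Unix-like system: the receive and send buffer sizes can be requested per socket with setsockopt(). Selective acknowledgements, by contrast, are usually a system-wide setting (net.ipv4.tcp_sack on Linux) rather than a per-socket option.

```python
import socket

# Request larger TCP buffers for one socket; the kernel may round or cap
# the values (Linux, for example, reports back twice the requested size).
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 4 * 1024 * 1024)

print("receive buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
print("send buffer:   ", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
```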
The TCP stack provided with some operating systems (OSs) might implement some of these optimizations. Specialized tools are available for transmitting large files over specially configured connections to improve performance. Network service providers may also implement TCP optimizations between their data centers to improve performance.
Web application performance is influenced by the HTTP and TCP protocols. The way components of a Web page are downloaded, the overhead of establishing connections, and the way packets are reliably delivered can all impact Web application performance. Developers, designers, and system architects do have options, though. Application and data replication can overcome the inherent limitations of long-distance transmissions. Connection pooling and other device-specific optimizations can help reduce overhead at the endpoints. Even TCP can be optimized to improve network performance.
Now that we have covered some of the lower-level details about how the Internet can impact Web application performance, it is time to turn our attention to higher-level constructs: network providers.
Figure 3.7: Peering creates a set of relationships between ISPs. These relationships determine at a high level how data is transmitted across the Internet (Source: By User:Ludovic.ferre (Internet Connectivity Distribution&Core.svg) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons).
Congestion

Traffic congestion is a risk in any network. When a large number of networks come together in a single location, that risk grows with the amount of traffic relative to the infrastructure in place to receive that traffic. To avoid the risk of congestion, you could reduce the amount of traffic on the Internet, but that is unlikely. In fact, you are more likely to face contention for available bandwidth as more and more devices become network-enabled. (See, for example, the McKinsey & Company report entitled The Internet of Things.)
Once again, you encounter a situation that you cannot eliminate but might be able to avoid. By replicating data and applications, you can avoid having to transmit data over congested paths on the Internet. When replication is not an option, and even in some cases where it is, you can still improve performance using various TCP optimizations and endpoint optimizations such as connection pooling.
Summary

In spite of all the time and effort you might put into tuning your Web application, the Internet can become a bottleneck for your Web application performance. Both the HTTP and TCP protocols have an impact on throughput. Network infrastructure and configuration dictate latency. Aspects such as noise and congestion factor into packet loss rates. These are difficult issues to contend with, especially when you deploy an application used on a global scale. Fortunately, there are strategies for dealing with these performance issues, which the next chapter will address.
Application maintenance
Data loss
Network disruption
Disruption in environmental controls in data centers
Hosting your data and application in multiple data centers can help mitigate the impact of each of these events (see Figure 4.1).
Figure 4.1: The implementation of multiple data centers offers a number of advantages, including resiliency to several potential problems (Image source: CDNetworks).
Application Maintenance

Applications have to be shut down for maintenance, sometimes due to needed upgrades or patches to an operating system (OS) or application code. In other cases, equipment in the data center needs to be replaced or reconfigured, making it unavailable to support your enterprise application. If the application and data are replicated to another data center, user traffic can be routed to the alternative data center, allowing users to continue working with the system.
This type of replication can also be done locally, within a data center. Failover servers and redundant storage arrays can allow for resiliency within the application. The most obvious risk with this approach, however, is that a data center-wide disruptive event would render the failover servers and storage inaccessible.
Data Loss

Data loss can occur as a result of hardware failures, software errors, and user mistakes. Depending on the type of failure, having data and applications replicated in multiple data centers can aid in recovery.
If the data had been replicated in at least one other data center, the data could be recovered and restored in the data center that is experiencing a failure. This scenario would be comparable to restoring data from a backup. There would, of course, be a time delay between the time of the failure and the time that data is restored.

In cases where the time between failure and restoration must be as short as possible, application designers can replicate both data and applications between data centers. In this circumstance, in the event of a hardware failure, users would be routed to an application running in the alternative data center. Users might experience degraded performance due to an increased number of users on the application servers; they might also face longer network latency if the alternative data center is significantly farther away than the data center with the failed hardware.
Software errors can corrupt data in many ways. A database operation designed to update a particular set of records might unintentionally update more than the target data set. An error in writing data to disk can cause a pointer to a data structure to be lost, leading to unrecoverable data. A miscalculation could write the wrong data to a database. In each of these cases, a data loss occurs and, unlike hardware-related data loss, the same type of error is likely to occur in replicated instances as well.
Application designers can plan for many types of potential human errors, from invalid data entry to changes that violate business rules. It is difficult to distinguish an intentional change from an unintentional change when data validation rules, business rules, or other criteria for assessing users' actions are not violated. Changes made by users that do not violate application rules are considered valid; by default, they will be accepted and eventually replicated to other instances of the data stores.
There are advantages to having multiple data centers to mitigate the risk of data loss. However, it is important to understand the limitations of this strategy's ability to recover from different causes of data loss.
Network Disruption

Network disruption at a data center level can adversely impact a large number of users. The importance of reliable access to the Internet for a data center cannot be overstated. Data centers typically contract with multiple Internet providers for access to the Internet. If one of the providers experiences a network disruption, traffic can be routed over the other providers' connections. The assumption, of course, is that the redundant services of multiple providers will not fail at the same time. This assumption is reasonable for the most part except when you consider major disruptions due to natural disasters. Severe storms that disrupt power for days or earthquakes that damage cables can leave entire data centers disconnected from the Internet for extended periods.
Avoiding areas prone to natural disasters can be a challenge. As Figure 4.2 shows, areas of high risk for seismic activity exist in the United States on the West Coast, in the Midwest, and in small areas of the Southeast. The Midwest and Gulf Coast are, in general, at low risk of seismic activity but are prone to tornadoes and hurricanes, respectively. For this reason, using multiple data centers located in areas with different risk profiles is a reasonable approach to mitigating the risk of data center-level network disruptions.
Figure 4.2: Seismic hazard map of the United States indicates locations of highest risk on the West Coast and parts of the Midwest and the Southeast (Source: Earthquake.usgs.gov).
Redundant data centers can help mitigate several risks of service disruption, ranging from application maintenance and software failure to natural disasters that destroy data centers and loss of environmental controls that diminish operating capacity. The advantages of multiple data centers are not limited to minimizing the impact of disruptions.
Reduced Latency

Latency, or the round-trip time from a server to a client device and back to the server, can vary significantly for different users. As Figure 4.3 shows, a customer in the Pacific Northwest accessing an application hosted in New York will experience latencies almost three times that of a customer in Florida. One way to address these wide variations in latency is to deploy applications to multiple data centers and architect applications to serve customers from the closest data center.
Figure 4.3: Latencies within a country can be substantially different for customers in different locations when those customers are served from a single data center (Image source: CDNetworks).
Clearly there are advantages to having multiple data centers host your applications. Multiple data centers provide for redundancy and improve the resiliency of applications. Hardware failures, network disruptions, hardware-based data loss, and risks of natural disasters can all be addressed to some degree with multiple data centers. It would seem obvious that we should all deploy large-scale, mission-critical applications to multiple data centers, but there are drawbacks. It is often said that there are no free lunches in economics. Similarly, there are no free solutions in IT.
Figure 4.4: Data centers, such as this one housed in Oregon in the US, are major investments to both build and operate (Source: By Tom Raftery (Flickr) [CC-BY-SA-2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons).
Operating costs in a typical data center are dominated by the costs of servers and power. Networking equipment, power distribution systems, cooling systems, and other infrastructure are also major cost categories. These costs have to be weighed against the benefits of deploying multiple data centers, which, as previously described, are substantial. There are alternative ways to achieve some of the same benefits, especially with regards to content delivery, without incurring the substantial step-wise costs of adding data centers.
Software Errors

Complex systems, such as data centers and cloud infrastructures, are subject to failure. Even when systems are designed for resiliency, they can experience problems due to software errors or unanticipated cascading effects that propagate through a complex system. Consider some of the major cloud service outages in the past several years as examples of what can go wrong:
Amazon, Google, and Microsoft are all major cloud providers with sufficient resources to deploy and manage multiple data centers. Yet even these major providers experience substantial disruptions due to software errors.
Synchronization Issues

When data has to be available in multiple data centers, you have to contend with synchronization issues, including both technical and cost considerations. Synchronizing large volumes of data can consume substantial amounts of bandwidth. Like the data sent back and forth to end users of applications, data sent to other data centers is subject to network congestion, long latencies, and lost packets. Sub-optimal network conditions can lead to extended synchronization times. In a worst-case scenario, delays in synchronizing data can adversely affect application performance.
For example, a server with stale data could report inaccurate information to a user, while another user issuing the same query but receiving data from a server in a different data center might get the correct, up-to-date information.
Perhaps one of the most significant unaddressed challenges of deploying multiple data centers is the fact that traffic might still be using non-optimized TCP configurations.
These unaddressed needs and disadvantages are not presented as deterrents from employing multiple data centers. Instead, the point is to use a more balanced approach that combines the benefits of multiple data centers with the benefits of a content delivery network, maximizing the advantages of the two while limiting the costs and disadvantages of each.
Multiple data centers provide redundancy but at a substantial cost and increased complexity. Nonetheless, having at least two data centers is difficult to avoid if you are looking to maintain application availability in the event of catastrophic failure at a single data center. One can reasonably ask whether one backup data center is enough.
There could conceivably be catastrophic failures at two data centers, or a catastrophic failure at one and a less significant but still performance-degrading event at the other data center. The right solution depends on your requirements and tolerance for risk.
If your risk profile allows for two data centers rather than more, you can reduce the overall capital and operational expenses for data centers. Two data centers employing a geo-load-balancing method can share the application traffic between the two data centers. This setup will likely reduce the latency for some users in close proximity to the data center, but in general, data centers do not solve the latency problem. As this scenario considers only two data centers, the number of users benefiting from this arrangement is limited.
Figure 4.5: Multiple data centers increase redundancy but at additional cost and management complexity.
If you combine two data centers with caching of static content along with dynamic application traffic optimization techniques focused on TCP and HTTP, you can improve end user experience by enabling lower latency along with improved availability.
Figure 4.6: The middle mile between data centers and the edge of the last mile can be optimized to improve TCP performance.
The first mile is the segment originating at the data center, while the last mile is the segment ending at the end user. The distance that packets must travel in the middle mile can be significant. Anything that can reduce the number of packets sent can help improve performance. These techniques include:
Figure 4.7: Static content is cached at data centers. When a user requests content, it is served from the closest CDN data center. If the content is not currently in the cache, it is requested from the origin data center.
Data compression can help reduce the number of packets that must be sent between the data center and client devices by reducing the size of the payload. TCP breaks a stream of data into individual units whose length depends on the medium; for example, Ethernet packets can be up to 1500 bytes long, including header information. The medium defines the maximum transmission unit on a network, so a valuable technique is to compress the payload data before it is transmitted.
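A small illustration of that arithmetic, assuming a 1500-byte Ethernet MTU and ignoring header overhead: compressing a repetitive text payload with gzip cuts the packet count roughly in proportion to the compression ratio.

```python
import gzip
import math

MTU = 1500  # assumed Ethernet maximum transmission unit, in bytes

payload = (b"<html><body>"
           + b"<p>repetitive page content</p>" * 2000
           + b"</body></html>")
compressed = gzip.compress(payload)

for label, data in (("uncompressed", payload), ("compressed", compressed)):
    packets = math.ceil(len(data) / MTU)   # header overhead ignored
    print(f"{label}: {len(data)} bytes -> about {packets} packets")
```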
TCP has a number of configuration parameters that allow for tuning. Characteristics such as buffer size and the settings controlling the retransmission of lost packets can be adjusted to maximize throughput. For example, one tuning technique employs a more efficient way of detecting a lost packet and reduces the number of packets that might be retransmitted unnecessarily. These types of techniques are especially important when optimizing non-static application traffic.
Static pages can be cached by content delivery networks and served to users within their region. Application-generated content, such as responses to database queries, search results, or reporting and analysis tool output, will vary from one user to another.
Caching

Acceleration techniques reduce latency and packet loss between data centers, but there are additional benefits of caching.
Figure 4.8: Dynamic content cannot be cached and must be sourced from the data center hosting the application generating the content. Acceleration improves throughput and allows for a more responsive application experience. Here, the user has accelerated access to dynamic content from a distant data center and fast access to static content from a closer data center.
Load Balancing

By using a content delivery network, the workload for serving content is distributed to multiple points of presence around the globe. Users in different parts of the world viewing the same Web page will see the same content even though that content is served from different locations. This load balancing is known as geo-load balancing and is one type of load balancing supported by content delivery network providers.
In addition to geo-load balancing, content delivery network providers can load balance within a point of presence. Clusters of servers can be deployed to serve static content so that a single Web server does not become a bottleneck. It would be unfortunate if, after globally distributing your content and deploying TCP optimizations and other acceleration techniques, a single Web server slowed the overall throughput of your application.
Redundant data centers provide for reliable access to content in the event of a failure at another data center
Acceleration techniques reduce latency and packet loss, resulting in more responsive applications from the end user's perspective
Acceleration techniques reduce the time required to keep static content synchronized across data centers
Content delivery networks maintain copies of static content in multiple data centers, reducing the distance between end users and content
Content delivery providers can offer load-balanced and fault-tolerant clusters of servers
Content delivery providers can monitor server status and the load on systems and adjust resources as needed to maintain acceptable performance levels
These benefits apply to most content delivery use cases, but they do not capture all the challenges an organization faces when distributing content globally. China, in particular, presents additional requirements that are worth considering as you evaluate content delivery and application delivery acceleration providers.
Figure 4.9: China presents many business opportunities, but regulation and cultural differences need to be considered (Source: By Cacahuate, amendments by Peter Fitzgerald and ClausHansen (Own work based on the map of China by PhiLiP) [CC-BY-SA-3.0-2.5-2.0-1.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons).
As with Internet service in any part of the globe, peering relationships can substantially affect latency and packet loss. ISPs in China may have peering agreements with other ISPs that lead to less than optimal routing of network traffic. This can lead to longer latencies and increased packet loss due to congestion. Due to issues with peering, businesses may experience less than 100% availability of network devices.
[Figure: Chart of millions of Internet users by country: China, India, Japan, Indonesia, South Korea, Philippines, Vietnam, Pakistan, Thailand, Malaysia.]
One way to address these technical issues is to have multiple points of presence in China. This setup allows for static content caching at more sites, and therefore closer to more users. Having multiple points of presence can help reduce the number of networks that must be traversed to deliver content, and thus avoid some of the negative consequences of poor peering.
In addition to these technical challenges, there are regulatory and legal issues that must be considered when delivering content in China.
Figure 4.11: Web content and sites are regulated in China, and sites readily available to users outside of China are inaccessible within that country (Source: GreatFirewallofChina.org).
The Great Firewall of China has two implications for delivering content in China: technical and legal. Downloading content within the Great Firewall of China can be significantly slower than doing so outside the firewall. In addition to long distances and poor peering, businesses need to consider the impact of censoring technologies on network performance.
Summary

Multiple data centers, content delivery networks, and application delivery networks with network acceleration can improve application performance and responsiveness from an end user's perspective. Data centers provide essential redundancy needed for reliable access to applications and content. They are, however, costly and complex to operate. By combining a small number of data centers with application and network acceleration technologies, businesses can realize improved application performance without the cost of additional data centers. Delivering content to global markets requires attention to a variety of national regulations and cultural expectations. China presents opportunities for business but requires compliance with laws governing the types of content that should be available to Internet users within China.
Before diving into a detailed description of service and deployment models, let's first define essential characteristics of all clouds.
On-demand service
Broad network access
Resource pooling
Rapid elasticity
Measured service

(For full details on the definition, see The NIST Definition of Cloud Computing: Recommendations of the National Institute of Standards and Technology. NIST Special Publication 800-145.)
Network performance is outside the scope of the cloud computing definition, but it is still an important element of overall application performance. Already in this discussion, it is becoming apparent that cloud computing alone does not address all the requirements of globally deployed enterprise applications.
Latency is a dominant problem in Web application performance. Cloud applications run in data centers that might be a long distance from end users, which introduces distance-induced latencies. There are other potential performance-degrading problems as well. For example, peering agreements between ISPs may be structured in ways that lead to degraded performance when data is routed over networks controlled by other ISPs.
On-Demand Service

Cloud users have the ability to provision and use virtualized resources when they are needed, for as long as they are needed. Specialized systems administration skills are not required. Users work with a cloud computing dashboard to select the types and number of virtualized servers needed. For example, if a group of analysts needs a server to analyze a large data set, they would log into a dashboard or control panel, select an appropriate type of virtual server, identify the machine image with the necessary operating system (OS) and analysis software, and launch the virtual machine.
It is important to emphasize the self-service nature of this process. Virtualization has long been used in data centers to improve the efficiency of server utilization, but prior to cloud computing, setting up a virtualized server required significant knowledge about OSs, hypervisors, and system configurations. Cloud computing platforms automate many of the steps required to instantiate a virtual machine.
Resource Pooling

Cloud computing customers share infrastructure. When a user starts a virtual machine, it is instantiated on a physical server in one of the cloud provider's data centers. The customer may be able to choose the region or data center in which the virtual machine runs but does not choose the physical server itself.
In all likelihood, the virtual machines running on a single server belong to several customers. Similarly, one customer's blocks of storage in a storage array may be intermingled with other customers' data. The cloud computing platform and hypervisors are responsible for isolating resources so that they are accessible only to resource owners and others explicitly granted access to the resources. With secure resource pooling, cloud providers can optimize the efficiency of server utilization by pooling the resource requirements of large numbers of customers.
Rapid Elasticity

In a cloud, it is relatively easy to instantiate a large number of virtual servers. For example, the group of analysts working on a large data set may determine that the best way to analyze the data is to use a cluster of servers working in parallel. Launching 20 servers is not much more difficult than launching one. (Coordinating the work of the 20 servers is a more complex problem, but there are applications for managing distributed workloads that can be readily deployed in a cloud.)
Cloud providers also offer services to monitor loads on servers and bring additional servers online as needed. For example, if there is a spike in demand for a Web application, the cloud management platform can detect the increased demand, bring additional servers online, and add them to a load-balanced group of servers. Those servers can be shut down automatically when demand drops.
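A hypothetical sketch of such a scaling policy; the thresholds, fleet limits, and helper functions (get_average_load, add_server, remove_server) are all made up for illustration and do not correspond to any particular provider's API.

```python
# Illustrative thresholds; real policies also use cooldown periods
# to avoid thrashing between scale-up and scale-down decisions.
SCALE_UP_LOAD = 0.75
SCALE_DOWN_LOAD = 0.25
MIN_SERVERS, MAX_SERVERS = 2, 20

def autoscale(get_average_load, add_server, remove_server, servers):
    """One monitoring pass: grow or shrink the load-balanced group."""
    load = get_average_load()
    if load > SCALE_UP_LOAD and len(servers) < MAX_SERVERS:
        servers.append(add_server())    # demand spike: bring a server online
    elif load < SCALE_DOWN_LOAD and len(servers) > MIN_SERVERS:
        remove_server(servers.pop())    # demand dropped: shut one down
    return servers
```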
The ability to rapidly add servers or storage can help address some aspects of peak demand, but this type of rapid elasticity does not address network-related performance issues. Consider the following simple example.
A Web application running in a cloud data center in the eastern United States is experiencing higher than usual demand from users in Europe and Asia. The load-monitoring system detects an increase in workload and brings an additional virtual server online. The set of application servers can now process a larger number of transactions in a given amount of time. This setup does not, however, affect the time required to transmit data between the servers and the client devices. The latency of the network between the data center in the US and the client devices in Europe and Asia is not altered by changes in the application server cluster running in the cloud.
Measured Service

Cloud providers use a pay-as-you-go, or pay-for-what-you-use, cost model. This setup fits well with the rapid elasticity and pooled resource aspects of cloud computing. Customers do not have sole use of dedicated hardware, so it is important to charge according to the share of resources used.
Cloud providers typically use tiered pricing based on the size or quality of a service. For example, a high-memory virtual machine will cost more than a low-memory virtual machine. Similarly, charges for higher-performance solid state drives will be higher than those for commodity disk storage.
As you can see from this description, the essential characteristics of cloud computing are insufficient to address the full range of cloud optimization requirements. There are different deployment and service models, and it is worth considering how these may impact cloud optimization issues.
Figure 5.1: The essential characteristics of cloud computing address many areas relevant to application performance, but key considerations, such as latency and packet loss, are outside the scope of cloud computing fundamentals.
Public
Private
Hybrid
Community
Public clouds are open for use by the general public. Generally, anyone with a credit card and access to the Internet can make use of public cloud resources. Public clouds are the least restrictive of the four deployment models with regards to who is granted access.
Private clouds are at the other end of the access spectrum. This type is one of the most restrictive cloud deployment models. Access to a private cloud is restricted to members of a single organization, such as a business or government entity.
Hybrid clouds are composed of two or more clouds that are linked in such a way as to allow for portability of data and applications between the clouds. This definition allows for different combinations of cloud types (for example, two private clouds, a private and community cloud, and so on), but a private cloud combined with a public cloud is most typical.
Community clouds are designed to serve a group of users from multiple organizations who share common requirements. For example, a community cloud could provide specialized HIPAA-compliant services to healthcare providers. Only members of the specialized community are granted access to these clouds.
Deployment models are important considerations for those concerned with security and compliance issues. These models do not directly affect application performance because the same cloud infrastructure and management platforms can be deployed in any of these models. Deployment models may indirectly affect performance in cases in which you want to leverage the benefits of content delivery services or network optimization services.
Consider an example of a hybrid cloud based on one private cloud and one public cloud. The public cloud may offer a proprietary content delivery network that works with content stored in the public cloud. Content maintained in the private cloud may have to be replicated to the public cloud before it can be served through the content delivery network.
In addition to considering deployment models, it is important to consider how different service models may impact overall application performance and our ability to optimize application and network services.
Infrastructure as a Service

IaaS models allow users the greatest level of control over the provisioned infrastructure. IaaS customers, for example, can choose the size of the virtual server they run, the OS, software libraries and applications, as well as various types of storage. Users control the OS, so they have substantial control over the application platform and system configuration. The owner of a provisioned server could:
Figure 5.2: IaaS allows for high levels of control but entails high levels of responsibility on the part of cloud users.
Platform as a Service

In a PaaS cloud, customers do not have substantial control over virtual servers, OSs, or application stacks. Instead, the PaaS provider supports programming languages, libraries, and application services used by developers to create programs that run on the PaaS platform. PaaS providers offer different types of services. Some specialize in a single language or language family, such as Java and languages that run on the Java virtual machine (JVM).
Others tend to be language agnostic but offer a variety of frameworks and data stores that can be combined to meet the particular needs of each customer.
An advantage of PaaS over IaaS is that the PaaS cloud provider is responsible for managing more components in the application stack and infrastructure. As with IaaS clouds, the PaaS cloud provider manages underlying hardware and network infrastructure on behalf of customers. PaaS providers manage additional software components as well.
Figure 5.3: PaaS models build on the same kind of infrastructure provided by an IaaS setup but alleviate some of the management responsibility of an IaaS.
The tradeoff for developers is that they are more constrained in their choices in a PaaS. For example, a PaaS cloud provider may implement a message queue service for its cloud. If developers need a message queue, they will have access to the PaaS provider's chosen tool. Developers working in an IaaS cloud could choose from a number of message queue systems and manage it themselves. Extending this model of increasing levels of service and decreasing levels of control brings us to the SaaS model.
Software as a Service

SaaS providers are the most specialized of cloud providers. Rather than focus on providing access to virtualized servers and storage or offering developers a managed application stack, SaaS providers offer full applications. Common SaaS use cases include:
The disadvantage of SaaS is that customers have the least control over the service. For example, a SaaS customer cannot generally dictate the type of data store used to persistently store application data. The SaaS provider makes such design decisions, and all customers use a common application platform.
Since SaaS providers have control over the underlying architecture of the system, it is important for customers to understand how the SaaS provider's design choices affect data integration, data protection, and other aspects of application management.
For example, data from a SaaS financial management system may be needed in an independent management reporting system. Users may be able to perform bulk exports or query smaller subsets of data through an application programming interface (API). Some customers may need to maintain their own backups of data from the SaaS application.
In this case, the customer will need to define and implement an export process in order to maintain up-to-date copies of data on-premise. Customers would also need to understand the data model used by the SaaS application to extract data for use in other applications.
Figure 5.4: SaaS provides turn-key services with the lowest levels of systems management responsibility but limited ability to customize the implementation of the service.
The essential characteristics of cloud computing, along with the deployment models and service models, give a broad, structured view of cloud computing services. From this perspective, you can see that cloud providers virtualize some, but not all, of the key components of enterprise-scale applications. In particular, cloud service providers virtualize:

Servers
Storage
Platform and software services
The servers may be substantially under the control of cloud users, as in the case of an IaaS cloud; abstracted to logical computing units, as in some PaaS providers; or essentially hidden from users, as is the case with SaaS.
Storage is also virtualized. IaaS cloud providers offer several types of persistent storage, including object storage, file system storage, and database storage systems. Services, such as message passing, search over unstructured data, and specialized data services, are also virtualized.
As one moves into PaaS and SaaS service models, higher-level services are provided. This movement from lower-level infrastructure management to turn-key system services may give the impression that these various cloud models provide a comprehensive platform for deploying enterprise-scale applications. Such is not always the case.
Application designers still have multiple areas of life cycle management and performance management to consider when deploying such applications:

Server failover
Application server replication
Content caching
Network optimization
Cloud providers offer solutions, or building blocks for solutions, for some of these, but as we will see, there is no single approach that will work for all plausible use cases.
A server-level failure could take an application offline unless there is a mechanism in place to shift the workload from the failed server to a functioning server running the same application. One simple way to do this is to deploy a cluster of servers running the same software and route traffic to those servers through a load balancer. The load balancer distributes the load across servers in the cluster. If the load balancer detects a failure in one of the servers (for example, the server does not respond to a ping request), the load balancer can route traffic to other servers in the cluster until the failed server is back online.
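A minimal sketch of that idea, with hypothetical backend addresses: round-robin selection that skips any server failing a simple TCP health check.

```python
import itertools
import socket

# Hypothetical backend servers in the cluster (host, port).
BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)]
rotation = itertools.cycle(BACKENDS)

def healthy(server, timeout=0.5):
    """Health check: can a TCP connection be opened to the backend?"""
    try:
        socket.create_connection(server, timeout=timeout).close()
        return True
    except OSError:
        return False

def pick_server():
    """Round-robin over the cluster, skipping servers that fail the check."""
    for _ in range(len(BACKENDS)):
        server = next(rotation)
        if healthy(server):
            return server
    raise RuntimeError("no healthy servers in the cluster")
```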
Figure 5.5: A load balancer provides a simple but often effective form of failover in a cluster of servers.
Enterprise applications may require redundancy across data centers. If application servers within a single data center are unavailable, traffic to those servers can be routed to an alternative data center. This situation is not ideal, of course. Depending on the distance, peering arrangements between ISPs, and other network configuration issues, there may be longer latency for users working with a distant data center. The additional workload on the application servers in the redundant data center might also slow application response time. Adding more servers to the application server cluster may help alleviate some of this problem, but there may be a limited number of servers available to add to the cluster. Such might be especially true if a large data center is experiencing a failure and multiple enterprise applications are shifting workloads to the redundant data center.
Figure 5.6: Multiple data centers can provide failover recovery in the event of a catastrophic failure at the data center level (Image source: CDNetworks).
The first thoughts about failover recovery may be focused on application servers and ensuring that backup servers are available. These items are a necessary part of failover recovery but are not the only crucial components. In addition to application servers, data must be accessible to the failover servers. When a single server fails in a cluster, the other servers will still have access to data on storage arrays in the data center. To ensure the ability to fail over between data centers, you must make sure data is replicated between the data centers.
Content Caching

Application designers can take advantage of the properties of static content to improve application performance. Static content is any type of data that can be generated and stored for use at some time in the future. This type encompasses content ranging from Web pages to data files that rarely change. Static content changes infrequently, so there is minimal risk to maintaining multiple copies.
Caching is an efficient strategy for a number of reasons. By keeping a copy of static content in multiple locations, users can receive data from the closest location and therefore typically reduce latency. Static data is stored in a local cache after the first time it is accessed.
Figure 5.7: Static content rarely changes, so once it is retrieved from a distant server, it can be cached for future use by other users of the closer data center.
Application designers do not have to devise a metric to determine the content most likely to be requested. Instead, only the content that has been requested is cached. Consider a multi-language Web site. The number of users requesting content in Finnish is likely to be low outside of northern Europe. The data center serving that region will likely have cached copies of that content, while data centers in other parts of the globe probably will not.
The amount of memory dedicated to caching will determine the upper bound on the amount of static data that can be cached at one time. An effective way to work within this limit is to track the last request time for each content object. Older objects that have not been requested are good candidates for removal from the cache without adversely impacting performance. News stories or major announcements, for example, may be popular for a time but eventually are surpassed in popularity by other, newer stories or announcements.
Removing the least-used content is just one strategy for managing caches. One could consider the size of an object and develop a weighting scheme that favors retaining smaller objects. The idea here is that removing a single large object would allow multiple smaller objects to be stored. Other factors, such as the number of times an object is requested over time, can also be factored into the cache management algorithm.
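A minimal sketch of the least-recently-requested strategy described above, bounded by total cached bytes rather than object count; the size-weighting and request-frequency refinements would slot into the eviction loop.

```python
from collections import OrderedDict

class LRUCache:
    """Evict the least recently requested objects once a byte budget is hit."""
    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.items = OrderedDict()   # key -> content, oldest request first

    def get(self, key):
        if key not in self.items:
            return None              # cache miss: fetch from origin instead
        self.items.move_to_end(key)  # record this as the most recent request
        return self.items[key]

    def put(self, key, content):
        if key in self.items:
            self.used -= len(self.items.pop(key))
        self.items[key] = content
        self.used += len(content)
        while self.used > self.max_bytes:        # evict oldest until in budget
            _, evicted = self.items.popitem(last=False)
            self.used -= len(evicted)

cache = LRUCache(max_bytes=1000)
cache.put("/index.html", b"x" * 600)
cache.put("/news.html", b"y" * 600)   # pushes /index.html out of the budget
print(cache.get("/index.html"))       # None -- evicted
```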
As useful as caching is, it does not address the need to improve network performance when transmitting dynamic content.
Network Optimization

Application architects have to consider many aspects of application performance and reliability. Network optimization is especially important for enterprise and globally used applications. Consider a hypothetical scenario: An organization is deploying a new financial market monitoring application to the cloud. The application will have users in North America, Europe, and Asia. The application collects near-real-time data from institutions across three continents. Up-to-date information is vital to the users of this application, so cached data will not meet requirements.
A commodity trader in Chicago, for example, might want the latest information on commodity prices in Hong Kong. A cached data set that is 2 hours old is essentially useless to the trader in Chicago. Or, it could be worse than useless. Making a decision based on out-of-date information could lead to costly transactions that could have been avoided with timely data.
As data must move between global data centers on an as-needed basis, network optimizations are key considerations. Network optimizations can include:
Distributed caching
Network protocol optimization
Application delivery network services
Secure content delivery
Support for content life cycle management

Application architects consider many factors, ranging from server reliability and failover to content caching and network optimizations. As more organizations adopt cloud computing infrastructures, it helps to consider how content delivery networks can complement cloud service providers.
Distributed caching, as noted earlier, helps to improve the performance of applications serving static content, while network protocol optimizations help with dynamic content applications.
Managing distributed applications is a complex and demanding process. Cloud providers have some basic controls and platforms to support distributed application management, but content delivery networks that also include application delivery services can offer additional services.
Discussions about enterprise computing and public clouds often include concerns about security and compliance. Here again, content delivery network providers can complement and supplement the services of public cloud providers with enterprise support for secure content delivery and support for the full life cycle of content management.
The next chapter will examine issues related to choosing a content delivery network service. As with other IT projects, making use of public clouds, content delivery services, and application delivery services requires planning and careful attention to integration issues.
Global reach
Dynamic content acceleration
Security
Architecture considerations
Key performance metrics
Technical support
Key business considerations
These evaluation areas are generally applicable to enterprise applications, but some topics might be more important than others. Application and organization requirements should determine the weight applied to each of these areas. For example, if your primary concern is increasing the performance of analytic applications used by customers across the globe, dynamic content acceleration is more important than static content caching.
Consider the suite of applications your organization is supporting as you determine the relative importance of each of these areas. Also keep in mind strategic plans and their implications for system design. You might find that you have, or will likely have, a combination of applications that could benefit from cloud application acceleration services.
Global Reach

Global reach in the context of cloud application acceleration has both a technical and a business dimension. Consider both during your evaluation.
Figure 6.1: Content delivery networks require globally distributed edge servers to support delivery from caches closer to end users.
Figure 6.2: Popular sites outside of China and those posting content deemed inappropriate might be blocked by the Chinese government. The censoring infrastructure is commonly known as the Great Firewall of China.
Consider how content delivery network and network acceleration providers can assist you with navigating local regulations, complying with operational restrictions, and responding to orders from the government. As with many other business services, organizations might consider developing in-house expertise to manage these issues.
This choice is reasonable in some cases; for example, when you have a long history of business in the country, have in-depth knowledge of legal and cultural issues, and understand legal procedures in the country.
When the cost of developing expertise in local matters outweighs the benefits, it is appropriate to consider how your content delivery network provider and network acceleration provider might be able to assist you with local matters.
Global reach encompasses both technical aspects, such as the distribution of edge servers and the ability to accelerate network traffic over global distances, and business aspects, such as support for complying with local regulations around the globe. Let's next turn to a more in-depth look at several technical areas.
Figure 6.3: The middle mile is the intermediate network infrastructure between edge networks.
When evaluating dynamic content acceleration techniques, consider how your provider implements acceleration. For example, traffic in the middle mile of the network between two edge servers can be optimized because the acceleration network provider controls both endpoints. The servers can negotiate protocol settings that reduce the number of packets that must be resent when packets are dropped and set other TCP configuration parameters to reduce overhead on the network.
The distribution of edge servers is also a key consideration in evaluating dynamic content acceleration. Edge servers should be distributed in ways that reach large portions of the user base, mitigate the impact of poorly performing peering agreements between ISPs, and maintain reasonable loads on the servers.
The implementation choices made by cloud acceleration providers are important because they can impact key application requirements, including:

High availability
Faster application performance
Improved end user experience
High Availability

Customers expect business Web sites and applications to be available 24x7 in spite of hardware failures, network problems, and malicious cyber-attacks. Cloud acceleration providers can help mitigate the risk of unwanted downtime by providing high-availability hardware and networks.
Hardware fails. Today's large-scale data centers house large numbers of servers and storage devices, and therefore it is reasonable to assume that at least one component in a large data center will fail in production. Cloud acceleration vendors can provide for high-availability Web sites and applications with a combination of failover clusters, redundant storage, and multiple data centers.
When high availability is a requirement for an application, that application may be run in a cluster of servers. In some cases, all servers in a cluster share the application workload, and if one fails, the other servers will continue to process the workload. In other cases, a single server may process the full workload while a stand-by server is constantly updated with the state of the primary server. If the primary server fails, the stand-by server takes over processing the workload.
Cloud providers should support the type of high-availability configuration appropriate for your applications.
Storage systems are also subject to occasional failures. Redundant storage systems improve high availability by reducing the chances that a storage failure will lead to application downtime.
Large-scale network problems and natural disasters can result in large-scale disruptions to a data center. Cloud acceleration providers with multiple data centers can continue to provide access to applications by routing traffic away from the disrupted data centers to other data centers hosting the same content or able to run the applications that had been available from the disrupted data center.
Security Considerations

Security is a broad topic that encompasses confidentiality, integrity, and availability of data and applications and applies to virtually all aspects of information technology. Cloud acceleration is no exception. Several topics are of particular importance with regards to content distribution networks and dynamic content acceleration:
As you plan to deploy content delivery networks and dynamic content acceleration, consider which content should be encrypted during transmission. If your organization has a data classification system in place, that system can inform you about the types of content that might need to be encrypted. For example, data subject to government or industry regulations may require strong encryption anytime it is transmitted over the Internet. In other cases, data classified as public (that is, data that would not cause any harm to the organization if released to the public) can be transmitted without encryption.
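A data classification system can be reduced to a simple transport-encryption policy. The labels and mapping in the following sketch are hypothetical; substitute your organization's own classification scheme:

```python
# Sketch of using a data classification system to decide which content
# must be encrypted in transit. The labels and the policy mapping are
# illustrative, not a standard scheme.
REQUIRES_TLS = {"regulated", "confidential", "internal"}

def must_encrypt(classification: str) -> bool:
    """Public data may travel unencrypted; everything else requires TLS."""
    return classification.lower() in REQUIRES_TLS

assert must_encrypt("regulated")   # e.g., data subject to HIPAA or PCI DSS
assert not must_encrypt("public")  # releasable without harm
```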
Accelerating Encryption
SSL takes a best-of-both-worlds approach and uses both asymmetric and symmetric key cryptography. Asymmetric cryptography is used during the SSL handshake (see Figure 6.8) when two devices are establishing an encrypted session. During the handshake, the two devices exchange information about the algorithms and other parameters that each supports. Because asymmetric techniques are used, this communication can occur over a secured channel. The devices also exchange a symmetric key that is used to encrypt data for the rest of the session.
Encrypting data, especially during the handshake using asymmetric encryption, is computationally demanding. Encrypting large volumes of data over many different sessions can place heavy demands on CPUs. Edge servers providing access to encrypted data can benefit from SSL acceleration.
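The handshake described above can be observed with Python's standard ssl module. In this small example, the asymmetric work happens inside wrap_socket(), and the cipher reported afterward is the symmetric algorithm protecting the rest of the session (the host shown is a placeholder):

```python
# Small example of an SSL/TLS handshake using the standard library.
# The asymmetric negotiation happens inside wrap_socket(); afterward,
# all traffic is encrypted with the negotiated symmetric session key.
import socket
import ssl

context = ssl.create_default_context()
with socket.create_connection(("www.example.com", 443)) as raw:
    with context.wrap_socket(raw, server_hostname="www.example.com") as tls:
        # Handshake is complete at this point.
        print(tls.version())  # e.g., 'TLSv1.3'
        print(tls.cipher())   # (cipher name, protocol, secret bits)
```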
Figure 6.4: In a typical TCP handshake, a client initiates a connection with a SYN packet, the server responds with a SYN-ACK, and the client then responds with an ACK.
In the case of a malicious attack, the client does not respond and the server is left holding the connection open while it waits for a response. Eventually, the connection will time out, but during that time, the attacker will have issued other connection requests, ultimately consuming all connection resources. As a result, legitimate traffic is unable to establish connections to the server (see Figure 6.5).
Figure 6.5: In a malicious DoS attack, clients flood the server with SYN packets, leading to corresponding SYN-ACKs that go unacknowledged by the attacker. As a result, the server waits for acknowledgements while connection resources are consumed by non-functional connections.
DDoS attacks use a collection of compromised devices, known as a botnet, to flood a target server (see Figure 6.6). The compromised devices have been infected with malware that allows the attacker to issue commands to them. These commands specify the type of attack to launch and the target server. In addition to the compromised computers that flood servers with malicious traffic, botnets often include multiple command and control servers. The person controlling the botnet, known as the bot herder, communicates with the command and control servers, which in turn communicate with the compromised devices.
One way to disrupt a botnet is to shut down or isolate the command and control server so that it can no longer issue commands. Botnet designers have recognized this potential single point of failure and have developed techniques to support multiple command and control servers. If one is identified and taken offline, another can assume the responsibilities of communicating with compromised devices. As a result, botnets are resilient to attacks on their infrastructure.
Figure 6.6: Botnets are distributed systems controlled by multiple command and control systems, making them difficult to disrupt by taking down bots or command and control servers.
One of the reasons for the growing threat of DDoS attacks is that they are relatively easy to launch. Information on how to launch a DDoS attack is readily available online. DDoS application code is available as well. Even those without the technical skill to implement their own attack can find DDoS service providers on the cybercrime black market who have their own DDoS infrastructure and launch attacks for others.
Figure 6.7: DDoS absorption blocks malicious DoS traffic before it reaches application servers.
A cloud acceleration provider should support DDoS attack mitigation. In addition to DDoS absorption devices, procedures should be in place to notify network engineers of an attack, provide detailed data about the attack, and support additional mitigation measures, such as using alternative data centers to maintain access to Web applications.
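One common building block behind DDoS absorption is per-client rate limiting, which drops excess traffic before it reaches application servers. The following is a minimal token-bucket sketch; the rate and burst thresholds are illustrative, and real absorption devices operate at the network layer rather than in application code:

```python
# Minimal per-client token-bucket rate limiter, sketching one building
# block of DDoS absorption. Thresholds are illustrative only.
import time
from collections import defaultdict

RATE = 10.0   # tokens (requests) replenished per second
BURST = 20.0  # maximum bucket size

# Each client starts with a full bucket and a refill timestamp.
buckets = defaultdict(lambda: [BURST, time.monotonic()])

def allow(client_ip: str) -> bool:
    """Refill the client's bucket, then spend one token if available."""
    tokens, last = buckets[client_ip]
    now = time.monotonic()
    tokens = min(BURST, tokens + (now - last) * RATE)
    if tokens >= 1.0:
        buckets[client_ip] = [tokens - 1.0, now]
        return True  # forward the request to the application servers
    buckets[client_ip] = [tokens, now]
    return False     # drop: the client has exceeded its allowed rate
```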
Data Security
Maintaining confidentiality of data with SSL and ensuring availability of applications with DDoS attack mitigation technologies are two key security considerations when evaluating cloud acceleration providers. A third key consideration is ensuring data security with regard to government and industry regulations.
Businesses and government agencies may be subject to multiple data protection regulations, such as the Sarbanes-Oxley (SOX) Act, the Health Insurance Portability and Accountability Act (HIPAA), the Payment Card Industry Data Security Standard (PCI DSS), and others. These regulations have specific requirements for protecting the confidentiality and integrity of data. Cloud acceleration providers should offer sufficient controls to allow customers to meet regulatory requirements.
Although regulations vary in their requirements, there are common characteristics, such as the need for access controls, confidentiality measures, and the ability to demonstrate compliance. As part of your evaluation, verify that cloud acceleration vendors comply with the regulations relevant to your business.
Authentication
Authentication is one area of security that sounds fairly straightforward but can be riddled with organizational challenges. It is the process of verifying the identity of a user prior to granting that user access to data, applications, and systems. This process requires an authentication infrastructure that supports:
Architecture Considerations
The architecture of the content delivery network and dynamic content acceleration network is another area to assess when evaluating potential providers. Consider edge server locations, protocol optimizations, and key performance metrics.
The location of edge servers helps to shape the overall performance of your applications. Edge servers that are in close physical proximity to users will help reduce latency because packets have shorter distances to travel. Edge servers on networks with high-performance peering agreements are less likely to be subject to degraded performance when using other ISPs' networks.
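Because proximity is ultimately about round-trip time, a simple way to compare candidate edge locations is to measure it directly. The following sketch times a TCP connect, which takes roughly one round trip; the edge hostnames are hypothetical:

```python
# Sketch of measuring round-trip time to candidate edge servers, the
# quantity that edge placement is meant to minimize. Hostnames are
# hypothetical placeholders.
import socket
import time

def tcp_rtt(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Time a TCP connect as a rough proxy for network round-trip time."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start

for edge in ["edge-us.example.com", "edge-eu.example.com"]:
    try:
        print(edge, f"{tcp_rtt(edge) * 1000:.1f} ms")
    except OSError:
        print(edge, "unreachable")
```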
Protocol optimizations are especially important for dynamic content acceleration. TCP has changed over time to support several types of optimizations that can improve throughput. These optimizations can benefit both static and dynamic content because they are typically applied to network traffic between edge servers.
Together, the location of edge servers and protocol optimizations can improve the overall performance of your applications. To quantify those improvements, look to multiple key performance metrics:
• Deployment time: How quickly can you reach new geographic regions or increase scalability in a region?
• Management support: How does the provider support your management of the network? Is technical support available 24x7?
• Analysis and reporting: What types of reports are available to help you manage ongoing operations? Are reports sufficient to support compliance with regulations?
Ideally, your provider will be able to assume the role of expert in managing the implementation details of the content delivery network. In some cases, the provider can also act as an intermediary in dealing with government regulations.
Of course, these technical considerations are all designed to support core business requirements related to meeting customer needs and expectations.
Ultimately, poor application performance translates into lower revenues. For example, a study by the Aberdeen Group found that a 1-second delay in page loading led to an 11% drop in page views and a 7% loss in sales. Even when customers finish a transaction on a poorly performing site, their chances of returning drop significantly: 79% of customers are less likely to purchase from a vendor in the future if they are dissatisfied with the Web site's performance.
Accelerating application performance allows application designers to continue to innovate and deliver quality user experiences. Just as important, it provides the means to maintain the performance required to reduce the risk that customers will abandon shopping carts, switch to competitor sites, or otherwise abandon an application or site.
Summary
Delivering applications to a global user base is challenging. You will face technical difficulties as well as cultural issues. Business needs are driving the adoption of cloud acceleration to improve the overall performance of applications. Technical considerations are best addressed with a combination of data centers, content delivery networks, and dynamic content acceleration techniques.
As this chapter has outlined, there are multiple considerations to evaluate when assessing content delivery network providers. The importance of particular considerations will vary according to your specific business requirements. Considering the full range of technical, business, and cultural issues you face in delivering content to a global user base will help you evaluate your content delivery network and cloud application acceleration options.