Mod - Python Manual
Release 3.3.1
Gregory Trubetskoy
E-mail: [email protected]
Copyright
© 2004 Apache Software Foundation.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an
“AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under the License.
Abstract
Mod_python allows embedding Python within the Apache server for a considerable boost in performance and added
flexibility in designing web-based applications.
This document aims to be the only necessary and authoritative source of information about mod_python, usable as a
comprehensive reference, a user guide and a tutorial all in one.
See Also:
Python Language Web Site
(http://www.python.org/)
for information on the Python language
Apache Server Web Site
(http://httpd.apache.org/)
for information on the Apache server
CONTENTS
1 Introduction
1.1 Performance
1.2 Flexibility
1.3 History
2 Installation
2.1 Prerequisites
2.2 Compiling
2.3 Installing
2.4 Testing
2.5 Troubleshooting
3 Tutorial
3.1 A Quick Start with the Publisher Handler
3.2 Quick Overview of how Apache Handles Requests
3.3 So what Exactly does Mod-python do?
3.4 Now something More Complicated - Authentication
3.5 Your Own 404 Handler
4 Python API
4.1 Multiple Interpreters
4.2 Overview of a Request Handler
4.3 Overview of a Filter Handler
4.4 Overview of a Connection Handler
4.5 apache – Access to Apache Internals
4.6 util – Miscellaneous Utilities
4.7 Cookie – HTTP State Management
4.8 Session – Session Management
4.9 psp – Python Server Pages
6.4 Pre-populating Globals
6.5 Conditional Expressions
6.6 Enabling INCLUDES Filter
7 Standard Handlers
7.1 Publisher Handler
7.2 PSP Handler
7.3 CGI Handler
8 Security
Index
CHAPTER
ONE
Introduction
1.1 Performance
One of the main advantages of mod_python is the increase in performance over traditional CGI. Below are results of
a very crude test. The test was done on a 1.2GHz Pentium machine running Red Hat Linux 7.3. The Apache benchmarking
tool ab was used to poll 4 kinds of scripts, all of which imported the standard cgi module (because this is how a
typical Python cgi script begins), then output a single word ‘Hello!’. The results are based on 10000 requests with
concurrency of 1.
1.2 Flexibility
Apache processes requests in phases (e.g. read the request, parse headers, check access, etc.). These phases can
be implemented by functions called handlers. Traditionally, handlers are written in C and compiled into Apache
modules. Mod_python provides a way to extend Apache functionality by writing Apache handlers in Python. For a
detailed description of Apache request processing, see the Apache API Notes, as well as the Mod_python
- Integrating Python with Apache paper.
To ease migration from CGI, a standard mod_python handler is provided that simulates the CGI environment, allowing
a user to run legacy scripts under mod_python with no changes to the code (in most cases).
See Also:
http://dev.apache.org/
Apache Developer Resources
http://www.modpython.org/python10/
Mod Python - Integrating Python with Apache, presented at Python 10
1.3 History
Mod_python originates from a project called Httpdapy (1997). For a long time Httpdapy was not called mod_python
because Httpdapy was not meant to be Apache-specific. Httpdapy was designed to be cross-platform and in fact was
initially written for the Netscape server (back then it was called Nsapy (1997)).
Nsapy itself was based on an original concept and first code by Aaron Watters from “Internet Programming with
Python” by Aaron Watters, Guido van Rossum and James C. Ahlstrom, ISBN 1-55851-484-8.
Without Aaron’s inspiration, there would be no mod_python. Quoting from the Httpdapy README file:
Although Nsapy only worked with Netscape servers, it was very generic in its
design and was based on some brilliant ideas that weren’t necessarily Netscape
specific. Its design is a combination of extensibility, simplicity and
efficiency that takes advantage of many of the key benefits of Python and is
totally in the spirit of Python.
This excerpt from the Httpdapy README file describes well the challenges and the solution provided by embedding
Python within the HTTP server:
Around the same time the Internet Programming With Python book came
out and the chapter describing how to embed Python within Netscape
server immediately caught my attention. I used the example in my
project, and developed an improved version of what I later called
Nsapy that compiled on both Windows NT and Solaris.
...continuing this saga, yours truly later learned that writing Httpdapy for every server is a task a little bigger and less
interesting than I originally imagined.
Instead, it seemed like providing a Python counterpart to the popular Perl Apache extension mod_perl that would give
Python users the same (or better) capability would be a much more exciting thing to do.
And so it was done. The first release of mod_python happened in May of 2000.
CHAPTER
TWO
Installation
Note: By far the best place to get help with installation and other issues is the mod_python mailing list. Please take
a moment to join the mod_python mailing list by sending an e-mail with the word ‘subscribe’ in the subject to
mod [email protected].
Also check out Graham Dumpleton’s assorted articles on mod_python at
http://www.dscpl.com.au/wiki/ModPython/Articles. These include alternate instructions for getting a first mod_python
handler working, as well as articles covering problems, shortcomings and constraints in various versions of
mod_python.
2.1 Prerequisites
• Python 2.3.4 or later. Python versions less than 2.3 will not work.
• Apache 2.0.54 or later. Apache versions 2.0.47 to 2.0.53 may work but have not been tested with this release.
(For Apache 1.3.x, use mod_python version 2.7.x).
In order to compile mod_python you will need to have the include files for both Apache and Python, as well as the
Python library installed on your system. If you installed Python and Apache from source, then you already have
everything needed. However, if you are using prepackaged software (e.g. Red Hat Linux RPM, Debian, or Solaris
packages from sunsite, etc.) then chances are you have just the binaries and not the sources on your system. Often, the
Apache and Python include files and libraries necessary to compile mod_python are part of a separate “development”
package. If you are not sure whether you have all the necessary files, either compile and install Python and Apache
from source, or refer to the documentation for your system on how to get the development packages.
2.2 Compiling
There are two ways in which modules can be compiled and linked to Apache - statically, or as a DSO (Dynamic Shared
Object).
DSO is a more popular approach nowadays and is the recommended one for mod_python. The module gets compiled
as a shared library which is dynamically loaded by the server at run time.
The advantage of DSO is that a module can be installed without recompiling Apache and used as needed. A more
detailed description of the Apache DSO mechanism is available at http://httpd.apache.org/docs-2.0/dso.html.
At this time only DSO is supported by mod_python.
Static linking is an older approach. With dynamic linking available on most platforms it is used less and less. The
main drawback is that it entails recompiling Apache, which in many instances is not a favorable option.
2.2.1 Running ./configure
The ./configure script will analyze your environment and create custom Makefiles particular to your system. Aside
from all the standard autoconf stuff, ./configure does the following:
• Finds out whether a program called apxs is available. This program is part of the standard Apache distribution,
and is necessary for DSO compilation. If apxs cannot be found in your PATH or in /usr/local/apache/bin, DSO
compilation will not be available.
You can manually specify the location of apxs by using the --with-apxs option, e.g.:
$ ./configure --with-apxs=/usr/local/apache/bin/apxs
$ ./configure --with-python=/usr/local/bin/python2.3
• Sets the directory for the Apache mutex locks. The default is /tmp. The directory must exist and be writable by
the owner of the Apache process.
Use the --with-mutex-dir option, e.g.:
$ ./configure --with-mutex-dir=/var/run/mod_python
The mutex directory can also be specified using a PythonOption directive. See Configuring Apache.
New in version 3.3.0
• Sets the maximum number of locks reserved by mod_python.
The mutexes used for locking are a limited resource on some systems. Increasing the maximum number of
locks may increase performance when using session locking. The default is 8. A reasonable number for higher
performance would be 32. Use the --with-max-locks option, e.g.:
$ ./configure --with-max-locks=32
The number of locks can also be specified using a PythonOption directive. See Configuring Apache.
New in version 3.2.0
• Attempts to locate flex and determine its version. If flex cannot be found in your PATH configure will fail. If
the wrong version is found configure will generate a warning. You can generally ignore this warning unless you
need to re-create src/psp_parser.c.
The parser used by psp (See 4.9) is written in C generated using flex. This requires a reentrant version of flex
which at this time is 2.5.31. Most platforms however ship with version 2.5.4 which is not suitable, so a pre-generated
copy of psp_parser.c is included with the source. If you do need to compile src/psp_parser.c you
must get the correct flex version.
If the first flex binary in the path is not suitable or not the one desired you can specify an alternative location
with the --with-flex option, e.g:
$ ./configure --with-flex=/usr/local/bin/flex
$ ./configure --with-python-src=/usr/src/python2.3
$ make
2.3 Installing
$ su
# make install
– This will simply copy the library into your Apache libexec directory, where all the other modules are.
– Lastly, it will install the Python libraries in site-packages and compile them.
NB: If you wish to selectively install just the Python libraries or the DSO (which may not always require
superuser privileges), you can use the following make targets: install_py_lib and install_dso
2.3.2 Configuring Apache
LoadModule
If you compiled mod_python as a DSO, you will need to tell Apache to load the module by adding the following
line in the Apache configuration file, usually called httpd.conf or apache.conf:
The actual path to mod_python.so may vary, but make install should report at the very end exactly where
mod_python.so was placed and how the LoadModule directive should appear.
Mutex Directory
The default directory for mutex lock files is /tmp. The default value can be specified at compile time using
./configure --with-mutex-dir.
Alternatively this value can be overridden at Apache startup using a PythonOption.
This may only be used in the server configuration context. It will be ignored if used in a directory, virtual
host, htaccess or location context. The most logical place for this directive in your Apache configuration file is
immediately following the LoadModule directive.
New in version 3.3.0
Mutex Locks
Mutexes are used in mod_python for session locking. The default value is 8.
On some systems the locking mechanism chosen uses valuable system resources. Notably on RH 8 sysv ipc is
used, which by default provides only 128 semaphores system-wide. On many other systems flock is used which
may result in a relatively large number of open files.
The optimal number of necessary locks is not clear. Increasing the maximum number of locks may increase
performance when using session locking. A reasonable number for higher performance might be 32.
The maximum number of locks can be specified at compile time using ./configure --with-max-locks.
Alternatively this value can be overridden at Apache startup using a PythonOption.
PythonOption mod_python.mutex_locks 8
This may only be used in the server configuration context. It will be ignored if used in a directory, virtual
host, htaccess or location context. The most logical place for this directive in your Apache configuration file is
immediately following the LoadModule directive.
New in version 3.3.0
2.4 Testing
Warning: These instructions are meant to be followed if you are using mod_python 3.x or later. If you are using
mod_python 2.7.x (namely, if you are using Apache 1.3.x), please refer to the proper documentation.
1. Make some directory that would be visible on your web site, for example, htdocs/test.
2. Add the following Apache directives, which can appear in either the main server configuration file, or .htaccess.
If you are going to be using the .htaccess file, you will not need the <Directory> tag below (the
directory then becomes the one in which the .htaccess file is located), and you will need to make sure the
AllowOverride directive applicable to this directory has at least FileInfo specified. (The default is
None, which will not work.)
<Directory /some/directory/htdocs/test>
AddHandler mod_python .py
PythonHandler mptest
PythonDebug On
</Directory>
(Substitute /some/directory above for something applicable to your system, usually your Apache ServerRoot)
3. This redirects all requests for URLs ending in .py to the mod_python handler. mod_python receives those
requests and looks for an appropriate PythonHandler to handle them. Here, there is a single PythonHandler
directive defining mptest as the Python handler to use. We’ll see next how this Python handler is defined.
4. At this time, if you made changes to the main configuration file, you will need to restart Apache in order for the
changes to take effect.
5. Edit the mptest.py file in the htdocs/test directory so that it has the following lines (be careful when cutting and
pasting from your browser, you may end up with incorrect indentation and a syntax error):
from mod_python import apache
def handler(req):
    req.content_type = 'text/plain'
    req.write("Hello World!")
    return apache.OK
6. Point your browser to the URL referring to the mptest.py; you should see ‘Hello World!’. If you didn’t -
refer to the troubleshooting section next.
7. Note that according to the configuration written above, you can also point your browser to any URL ending in
.py in the test directory. You can for example point your browser to /test/foobar.py and it will be handled by
mptest.py. That’s because you explicitly set the handler to always be mptest, whatever the requested file was.
If you want to have many handler files named handler1.py, handler2.py and so on, and have them accessible
on /test/handler1.py, /test/handler2.py, etc., then you have to use a higher level handler system such as the
mod_python publisher (see 3.1), mpservlets or Vampire. Those are just special mod_python handlers that know
how to map requests to a dynamically loaded handler.
8. If everything worked well, move on to Chapter 3, Tutorial.
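The handler from step 5 can also be exercised without a running server. The following is a minimal sketch, assuming Python 3; FakeRequest and the apache namespace below are hypothetical stand-ins for the objects mod_python supplies and are not part of mod_python itself.

```python
from types import SimpleNamespace

# Stand-in for the mod_python.apache module (assumption: only OK is needed
# here, and its value is illustrative).
apache = SimpleNamespace(OK=0)


class FakeRequest:
    """Hypothetical substitute for the mod_python request object."""

    def __init__(self):
        self.content_type = None
        self.body = []

    def write(self, text):
        # Collect output instead of sending it to a client.
        self.body.append(text)


def handler(req):
    req.content_type = "text/plain"
    req.write("Hello World!")
    return apache.OK


req = FakeRequest()
status = handler(req)
print(status == apache.OK, req.content_type, "".join(req.body))
```

A check like this only confirms the handler logic; it says nothing about whether the Apache directives in step 2 are correct.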
See Also:
http://www.astro.umass.edu/%7edpopowich/python/mpservlets/
mpservlets
http://www.dscpl.com.au/projects/vampire
Vampire
2.5 Troubleshooting
There are a few things you can try to identify the problem:
./httpd -X
This prevents Apache from backgrounding itself and may provide some useful information.
• Beginning with mod_python 3.2.0, you can use the mod_python.testhandler to diagnose your configuration.
Add this to your httpd.conf file:
<Location /mpinfo>
SetHandler mod_python
PythonHandler mod_python.testhandler
</Location>
Now point your browser to the /mpinfo URL (e.g. http://localhost/mpinfo) and note down the information given.
This will help you report your problem to the mod_python list.
• Ask on the mod_python list. Make sure to provide specifics such as:
CHAPTER
THREE
Tutorial
This is a quick guide to getting started with mod_python programming once you have it installed. This is not an
installation manual!
It is also highly recommended to read (at least the top part of) Section 4, Python API after completing this tutorial.
The following example will demonstrate a simple feedback form. The form will ask for the name, e-mail address
and a comment and construct an e-mail to the webmaster using the information submitted by the user. This simple
application consists of two files: form.html - the form to collect the data, and form.py - the target of the form’s action.
Here is the html for the form:
<html>
Please provide feedback below:
<p>
<form action="form.py/email" method="POST">
</form>
</html>
Note the action attribute of the <form> tag points to form.py/email. We are going to create a file called
form.py, like this:
import smtplib
%s
Thank You,
%s
# send it out
conn = smtplib.SMTP(SMTP_SERVER)
conn.sendmail(email, [WEBMASTER], msg)
conn.quit()
Dear %s,<br>
Thank You for your kind comments, we
will get back to you shortly.
</html>""" % name
return s
When the user clicks the Submit button, the publisher handler will load the email function in the form module,
passing it the form fields as keyword arguments. It will also pass the request object as req.
Note that you do not have to have req as one of the arguments if you do not need it. The publisher handler is smart
enough to pass your function only those arguments that it will accept.
The data is sent back to the browser via the return value of the function.
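The idea of passing a function only the arguments it accepts can be sketched in a few lines. This is an illustration under stated assumptions, not the publisher handler's actual implementation: call_with_accepted_args is a hypothetical helper, the sketch assumes Python 3, and the field values are made up.

```python
import inspect


def call_with_accepted_args(func, available):
    """Call func with only those entries of `available` that it accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the function accepts everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(**available)
    accepted = {k: v for k, v in available.items() if k in params}
    return func(**accepted)


def email(name, comment):
    # Hypothetical publisher-style function that does not ask for req.
    return "Dear %s, thank you for: %s" % (name, comment)


# Form fields plus the request object, as the dispatcher would see them.
fields = {"req": object(), "name": "Ann", "email": "ann at example dot org",
          "comment": "Nice site"}
result = call_with_accepted_args(email, fields)
print(result)
```

Here the req and email entries are dropped silently because the function does not name them, which mirrors the behaviour described above.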
Even though the Publisher handler simplifies mod_python programming a great deal, all the power of mod_python
is still available to this program, since it has access to the request object. You can do all the same things you can
do with a “native” mod_python handler, e.g. set custom headers via req.headers_out, return errors by raising
apache.SERVER_RETURN exceptions, write or read directly to and from the client via req.write() and
req.read(), etc.
Read Section 7.1 Publisher Handler for more information on the publisher handler.
from mod_python import apache
def handler(req):
    req.content_type = "text/plain"
    req.write("Hello World!")
    return apache.OK
1. If not already done, prepend the directory in which the PythonHandler directive was found to sys.path.
2. Attempt to import a module by name myscript. (Note that if myscript was in a subdirectory of the
directory where PythonHandler was specified, then the import would not work because said subdirectory
would not be in the sys.path. One way around this is to use package notation, e.g. ‘PythonHandler
subdir.myscript’.)
3. Look for a function called handler in myscript.
4. Call the function, passing it a request object. (More on what a request object is later)
5. At this point we’re inside the script:
• from mod_python import apache
This imports the apache module which provides us the interface to Apache. With a few rare exceptions,
every mod_python program will have this line.
• def handler(req):
This is our handler function declaration. It is called ‘handler’ because mod_python takes the name of
the directive, converts it to lower case and removes the word ‘python’. Thus ‘PythonHandler’ becomes
‘handler’. You could name it something else, and specify it explicitly in the directive using ‘::’.
For example, if the handler function was called ‘spam’, then the directive would be ‘PythonHandler
myscript::spam’.
Note that a handler must take one argument - the request object. The request object is an object that
provides all of the information about this particular request - such as the IP of client, the headers, the URI,
etc. The communication back to the client is also done via the request object, i.e. there is no “response”
object.
•
req.content_type = "text/plain"
This sets the content type to ‘text/plain’. The default is usually ‘text/html’, but since our handler
doesn’t produce any html, ‘text/plain’ is more appropriate. Important: you should always make
sure this is set before any call to ‘req.write’. When you first call ‘req.write’, the response HTTP
header is sent to the client and all subsequent changes to the content type (or other HTTP headers) are
simply lost.
•
req.write("Hello World!")
This writes the ‘Hello World!’ string to the client. (Did I really have to explain this one?)
•
return apache.OK
This tells Apache that everything went OK and that the request has been processed. If things
did not go OK, that line could be return apache.HTTP_INTERNAL_SERVER_ERROR or return
apache.HTTP_FORBIDDEN. When things do not go OK, Apache will log the error and generate an
error message for the client.
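The naming rule described in the steps above (lower-case the directive, drop the word ‘python’, and let ‘::’ override the result) can be expressed as a short sketch. resolve_handler is a hypothetical helper written for illustration, assuming Python 3; it is not a mod_python function.

```python
def resolve_handler(directive, argument):
    """Return (module_name, function_name) for a handler directive."""
    # Default function name: lower-case the directive, drop the word 'python'.
    default_func = directive.lower().replace("python", "", 1)
    # An explicit 'module::function' argument overrides the default.
    module, sep, func = argument.partition("::")
    return module, (func if sep else default_func)


print(resolve_handler("PythonHandler", "myscript"))
print(resolve_handler("PythonHandler", "myscript::spam"))
```

The same rule yields ‘authenhandler’ for a PythonAuthenHandler directive.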
Some food for thought: If you were paying attention, you noticed that the text above didn’t specify that in
order for the handler code to be executed, the URL needs to refer to myscript.py. The only requirement was
that it refers to a .py file. In fact the name of the file doesn’t matter, and the file referred to in the URL
doesn’t have to exist. So, given the above configuration, ‘http://myserver/mywebdir/myscript.py’ and
‘http://myserver/mywebdir/montypython.py’ would give the exact same result. The important thing
to understand here is that a handler augments the server behaviour when processing a specific type of file, not an
individual file.
At this point, if you didn’t understand the above paragraph, go back and read it again, until you do.
Notice that the same script is specified for two different handlers. This is fine, because if you remember, mod_python
will look for different functions within that script for the different handlers.
Next, we need to tell Apache that we are using Basic HTTP authentication, and only valid users are allowed (this is
fairly basic Apache stuff, so we’re not going to go into details here). Our config looks like this now:
<Directory /mywebdir>
AddHandler mod_python .py
PythonHandler myscript
PythonAuthenHandler myscript
PythonDebug On
AuthType Basic
AuthName "Restricted Area"
require valid-user
</Directory>
Note that depending on which version of Apache is being used, you may need to set either the AuthAuthoritative
or AuthBasicAuthoritative directive to Off to tell Apache that you want to allow the task of performing basic
authentication to fall through to your handler.
def authenhandler(req):
    pw = req.get_basic_auth_pw()
    user = req.user
•
def authenhandler(req):
This is the handler function declaration. This one is called authenhandler because, as we already described
above, mod_python takes the name of the directive (PythonAuthenHandler), drops the word ‘Python’
and converts it to lower case.
•
pw = req.get_basic_auth_pw()
This is how we obtain the password. The basic HTTP authentication transmits the password in base64 encoded
form to make it a little bit less obvious. This function decodes the password and returns it as a string. Note that
we have to call this function before obtaining the user name.
•
user = req.user
This is how you obtain the username that the user entered.
•
if user == "spam" and pw == "eggs":
    return apache.OK
We compare the values provided by the user, and if they are what we were expecting, we tell Apache to go
ahead and proceed by returning apache.OK. Apache will then consider this phase of the request complete,
and proceed to the next phase. (Which in this case would be handler() if it’s a .py file).
•
else:
    return apache.HTTP_UNAUTHORIZED
Else, we tell Apache to return HTTP_UNAUTHORIZED to the client, which usually causes the browser to pop a
dialog box asking for username and password.
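Putting the pieces walked through above together, the whole authentication handler can be tried outside Apache. This is a minimal sketch assuming Python 3: FakeRequest and the apache namespace are hypothetical stand-ins, and the way FakeRequest fills in user only after get_basic_auth_pw() is called is an assumption made to mirror the ordering requirement noted above.

```python
from types import SimpleNamespace

# Stand-in for mod_python.apache (assumption: values mirror the usual codes).
apache = SimpleNamespace(OK=0, HTTP_UNAUTHORIZED=401)


class FakeRequest:
    """Hypothetical request carrying Basic credentials for the sketch."""

    def __init__(self, user, password):
        self._user, self._password = user, password
        self.user = None  # populated only once the password has been read

    def get_basic_auth_pw(self):
        self.user = self._user
        return self._password


def authenhandler(req):
    pw = req.get_basic_auth_pw()  # must come before reading req.user
    user = req.user
    if user == "spam" and pw == "eggs":
        return apache.OK
    else:
        return apache.HTTP_UNAUTHORIZED


print(authenhandler(FakeRequest("spam", "eggs")))
print(authenhandler(FakeRequest("spam", "ham")))
```

Hard-coded credentials are used only because the tutorial does so; a real handler would check them against some credential store.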
def handler(req):
    if req.filename[-17:] == 'apache-error.html':
        # make Apache report an error and render the error page
        return(apache.HTTP_NOT_FOUND)
    if req.filename[-18:] == 'handler-error.html':
        # use our own error page
        req.status = apache.HTTP_NOT_FOUND
        pagebuffer = 'Page not here. Page left, not know where gone.'
    else:
        # use the contents of a file
        pagebuffer = open(req.filename, 'r').read()
Note that when returning an error page from a handler phase other than the response handler, the value
apache.DONE must be returned instead of apache.OK. If this is not done, subsequent handler phases will still
be run. The value apache.DONE indicates that processing of the request should be stopped immediately. If using
stacked response handlers, apache.DONE should also be returned, where appropriate, to prevent subsequent
handlers registered for that phase from being run.
CHAPTER
FOUR
Python API
PEP 0311 - Simplified Global Interpreter Lock Acquisition for Extensions
• apache.OK, meaning this phase of the request was handled by this handler and no errors occurred.
• apache.DECLINED, meaning this handler has not handled this phase of the request to completion and Apache
needs to look for another handler in subsequent modules.
• apache.HTTP_ERROR, meaning an HTTP error occurred. HTTP_ERROR can be any of the following:
As an alternative to returning an HTTP error code, handlers can signal an error by raising the
apache.SERVER_RETURN exception, and providing an HTTP error code as the exception value, e.g.
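The exception-based alternative can be sketched as follows. This is an illustration assuming Python 3: SERVER_RETURN, OK and run below are hypothetical stand-ins, not the mod_python objects. Under the Python 2 syntax that mod_python targets, the raise would be written raise apache.SERVER_RETURN, apache.HTTP_FORBIDDEN.

```python
HTTP_FORBIDDEN = 403  # standard HTTP status code
OK = 0                # stand-in for apache.OK


class SERVER_RETURN(Exception):
    """Stand-in for apache.SERVER_RETURN; carries the status as its value."""


def handler(allowed):
    if not allowed:
        # Signal the error by raising instead of returning the code.
        raise SERVER_RETURN(HTTP_FORBIDDEN)
    return OK


def run(handler_func, *args):
    """Hypothetical caller that turns the exception back into a status."""
    try:
        return handler_func(*args)
    except SERVER_RETURN as exc:
        return exc.args[0]


print(run(handler, True), run(handler, False))
```

Raising is convenient when the error is detected several calls deep, where returning a status code would have to be propagated by hand.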
def requesthandler(req):
    req.content_type = "text/plain"
    req.write("Hello World!")
    return apache.OK
And here is what the code for the ‘capitalize.py’ might look like:
def outputfilter(filter):
    s = filter.read()
    while s:
        filter.write(s.upper())
        s = filter.read()
    if s is None:
        filter.close()
When writing filters, keep in mind that a filter will be called any time anything upstream requests an IO operation, and
the filter has no control over the amount of data passed through it and no notion of where in the request processing it
is called. For example, within a single request, a filter may be called once or five times, and there is no way for the
filter to know beforehand that the request is over and which of the calls is last or first for this request, though
encountering an EOS (None returned from a read operation) is a fairly strong indication of the end of a request.
Also note that filters may end up being called recursively in subrequests. To avoid the data being altered more than
once, always make sure you are not in a subrequest by examining the req.main value.
For more information on filters, see http://httpd.apache.org/docs-2.0/developer/filters.html.
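To see how a filter behaves when it is invoked more than once per request, the capitalizing filter can be driven by a fake filter object. This is a sketch assuming Python 3; FakeFilter is a hypothetical stand-in, and its convention of returning an empty string when an invocation has no more data (and None only at EOS) is an assumption made for the illustration.

```python
class FakeFilter:
    """Hypothetical stand-in for the mod_python filter object."""

    def __init__(self, chunks):
        self._chunks = list(chunks)  # a None entry marks EOS
        self.out = []
        self.closed = False

    def read(self):
        # Return the next chunk; '' when this invocation has no more data.
        return self._chunks.pop(0) if self._chunks else ""

    def write(self, data):
        self.out.append(data)

    def close(self):
        self.closed = True


def outputfilter(filter):
    s = filter.read()
    while s:
        filter.write(s.upper())
        s = filter.read()
    if s is None:
        filter.close()


first = FakeFilter(["hello "])       # an invocation that ends without EOS
last = FakeFilter(["world", None])   # the invocation that sees EOS
outputfilter(first)
outputfilter(last)
print(first.out, first.closed, last.out, last.closed)
```

Only the invocation that reads None closes the filter; the earlier one simply returns after passing its data along.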
PythonConnectionHandler echo
from mod_python import apache
def connectionhandler(conn):
    while 1:
        conn.write(conn.readline())
    return apache.OK
The mod_python.apache module defines the following functions and objects. For a more in-depth look at Apache
internals, see the Apache Developer page.
4.5.1 Functions
log_error(message[, level, server])
An interface to the Apache ap_log_error() function. message is a string with the error message, level is
one of the following flag constants:
APLOG_EMERG
APLOG_ALERT
APLOG_CRIT
APLOG_ERR
APLOG_WARNING
APLOG_NOTICE
APLOG_INFO
APLOG_DEBUG
APLOG_NOERRNO
server is a reference to a req.server object. If server is not specified, then the error will be logged to the
default error log, otherwise it will be written to the error log for the appropriate virtual server. When server is
not specified, the setting of LogLevel does not apply; the LogLevel is dictated by an httpd compile-time default,
usually warn.
If you have a reference to a request object available, consider using req.log_error instead; it will prepend
request-specific information such as the source IP of the request to the log entry.
import_module(module_name[, autoreload=None, log=None, path=None])
This function can be used to import modules.
Note: This function and the module importer were completely reimplemented in mod_python 3.3. If you are
using an older version of mod_python do not rely on this documentation and instead refer to the documentation
for the specific version you are using as the new importer does not behave exactly the same and has additional
features.
If you are trying to port code from an older version of mod_python to mod_python 3.3 and can’t work out why
the new importer is not working for you, you can enable the old module importer for specific Python interpreter
instances by using:
where 'name' is the name of the interpreter instance or '*' for it to be applied to all interpreter instances. This
option should be placed at global context within the main Apache configuration files.
The apache.import module() function is not just a wrapper for the standard Python module import
mechanism. The purpose of the function and the mod python module importer in general, is to provide a means
of being able to import modules based on their exact location, with modules being distinguished based on their
location rather than just the name of the module. Distinguishing modules in this way, rather than by name alone,
means that the same module name can be used for handlers and other code in multiple directories and they will
not interfere with each other.
A secondary feature of the module importer is to have modules automatically reloaded
when the corresponding code file has been changed on disk. Reloading modules in this
way means that it is possible to change the code for a web application without having to restart the whole
Apache web server. Although this was always the intent of the module importer, prior to mod_python 3.3 its
effectiveness was limited. With mod_python 3.3, however, the module reloading feature is much more robust
and will correctly reload parent modules even when it was only a child module that was changed.
When the apache.import_module() function is called with just the name of the module, as opposed to a
path to the actual code file for the module, a search has to be made for the module. The first set of directories
that will be checked are those specified by the path argument if supplied.
Where the function is called from another module which had previously been imported by the mod_python
importer, the next directory which will be checked will be the directory in which the parent module is located.
Where that same parent module contains a global data variable called __mp_path__ containing a list of
directories, those directories will also be searched.
Finally, the mod_python module importer will search directories specified by the PythonOption called
mod_python.importer.path.
For example:
PythonOption mod_python.importer.path "['/some/path']"
The argument to the option must be in the form of a Python list. The enclosing quotes are to ensure that
Apache interprets the argument as a single value. The list must be self contained and cannot reference any prior
value of the option. The list MUST NOT reference sys.path nor should any directory which also appears in
sys.path be listed in the mod_python module importer search path.
When searching for the module, a check is made for any code file with the name specified and having a ’.py’
extension. Because only modules implemented as a single file will be found, packages will not be found nor
modules contained within a package.
In any case where a module cannot be found, control is handed off to the standard Python module importer
which will attempt to find the module or package by searching sys.path.
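The search order described above can be pictured with a small stand-alone sketch. This is not the mod_python implementation; the function name find_module_file and its keyword arguments are hypothetical and exist only to illustrate the order in which directories are consulted and the fact that only single-file '.py' modules are matched.

```python
import os


def find_module_file(name, path=None, parent_dir=None,
                     parent_mp_path=None, option_path=None):
    """Return the first '<name>.py' found in search order, else None.

    Illustrative only: a None result stands for "hand off to the
    standard Python importer and sys.path".
    """
    candidates = []
    candidates.extend(path or [])            # 1. the path argument
    if parent_dir is not None:
        candidates.append(parent_dir)        # 2. directory of the importing module
    candidates.extend(parent_mp_path or [])  # 3. the parent's __mp_path__
    candidates.extend(option_path or [])     # 4. mod_python.importer.path
    for directory in candidates:
        filename = os.path.join(directory, name + '.py')
        # Only plain files are matched, so package directories are skipped.
        if os.path.isfile(filename):
            return filename
    return None
```

Because a package directory never matches the '<name>.py' test, a lookup for a package name falls through to the sys.path stage, which agrees with the behaviour described above.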
If an __init__.py file is present and it was necessary to import it to achieve the same result as importing the
root of a true Python package, then __init__ can be used as the module name. For example:
As a true Python package is not being used, if a module in the directory needs to refer to another module in the
same directory, it should use just its name, it should not use any form of dotted path name via the root of the
package as would be the case for true Python packages. Modules in subdirectories can be imported by using a
’/’ separated path where the first part of the path is the name of the subdirectory.
As a new feature in mod_python 3.3, when using the standard Python ’import’ statement to import a module, if
the import is being done from a module which was previously imported by the mod_python module importer,
it is equivalent to having called apache.import_module() directly.
For example:
import name
is equivalent to:
and:
or:
Where the file has an extension, that extension must be supplied. Although it is recommended that code files
still make use of the ’.py’ extension, it is not actually a requirement and an alternate extension can be used. For
example:
To avoid the need to use hard coded absolute path names to modules, a few shortcuts are provided. The first of
these allow for the use of relative path names with respect to the directory the module performing the import is
located within.
For example:
parent = apache.import_module('../module.py')
subdir = apache.import_module('./subdir/module.py')
Forward slashes must always be used for the prefixes ’./’ and ’../’, even on Windows hosts where native pathnames
use a backslash. Forward slashes are used because that is what Apache normalizes all paths
to internally. If you are using Windows and have been using backslashes with Directory directives etc,
you are using Apache contrary to the accepted norm.
A further shortcut allows paths to be declared relative to what is regarded as the handler root directory. The
handler root directory is the directory context in which the active Python*Handler directive was specified.
If the directive was specified within a Location or VirtualHost directive, or at global server scope, the
handler root will be the relevant document root for the server.
To express paths relative to the handler root, the ’~/’ prefix should be used. A forward slash must again always
be used, even on Windows.
For example:
parent = apache.import_module('~/../module.py')
subdir = apache.import_module('~/subdir/module.py')
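As a rough illustration of how the three prefixes relate to each other, the sketch below resolves a path the way the text describes. It is an assumption-laden model, not mod_python code: resolve_module_path, importer_dir and handler_root are hypothetical names, and posixpath is used because the prefixes always use forward slashes.

```python
import posixpath


def resolve_module_path(spec, importer_dir, handler_root):
    """Resolve './', '../' and '~/' prefixes to a normalized path.

    importer_dir is the directory of the module doing the import;
    handler_root is the directory tied to the active handler directive.
    """
    if spec == '~':
        # The handler root itself, trailing slash left off.
        return posixpath.normpath(handler_root)
    if spec.startswith('~/'):
        return posixpath.normpath(posixpath.join(handler_root, spec[2:]))
    if spec.startswith('./') or spec.startswith('../'):
        return posixpath.normpath(posixpath.join(importer_dir, spec))
    return spec  # absolute paths and bare names are left untouched
```

With a handler root of /srv/app, the path '~/subdir/module.py' resolves to /srv/app/subdir/module.py, while '../module.py' is resolved against the importing module's own directory instead.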
In all cases where a path to the actual code file for a module is given, the path argument is redundant as there is
no need to search through a list of directories to find the module. In these situations, the path is instead taken
to be a list of directories to use as the initial value of the __mp_path__ variable contained in the imported
modules instead of an empty path.
This feature can be used to attach a more restrictive search path to a set of modules rather than using the
PythonOption to set a global search path. To do this, the modules should always be imported through
a specific parent module. That module should then always import submodules using paths and supply
__mp_path__ as the path argument to subsequent calls to apache.import_module() within that module.
For example:
The parent module may, if required, extend the value of __mp_path__ prior to using it. Any such directories
will be added to those inherited via the path argument. For example:
import os
here = os.path.dirname(__file__)
subdir = os.path.join(here, 'subdir')
__mp_path__.append(subdir)
In all cases where a search path is being specified which is specific to the mod_python module importer, whether
it be specified using the PythonOption called mod_python.importer.path, using the path argument
to the apache.import_module() function or in the __mp_path__ attribute, the prefix ’~/’ can be used
in a path and that path will be taken as being relative to handler root. For example:
If wishing to refer to the handler root directory itself, then ’~’ can be used and the trailing slash left off. For
example:
Note that with the new module importer, directories associated with Python*Handler directives are no
longer added automatically to sys.path; they are instead used directly by the module importer
only when required. As a result, some existing code which expected to be able to import modules in the handler root
directory from a module in a subdirectory may no longer work. In these situations it will be necessary to set
the mod_python module importer path to include ’~’ or list ’~’ in the __mp_path__ attribute of the module
performing the import.
This trick of listing ’~’ in the module importer path will not however help in the case where Python packages
were previously being placed into the handler root directory. In this case, the Python package should either be
moved out of the document tree and the directory where it is located listed against the PythonPath directive,
or the package should be converted into the pseudo-package form that mod_python supports and the module imports
used to access the package changed accordingly.
Only modules which could be imported by the mod python module importer will be candidates for automatic
reloading when changes are made to the code file on disk. Any modules or packages which were located in a
directory listed in sys.path and which were imported using the standard Python module importer will not be
candidates for reloading.
Even where modules are candidates for module reloading, unless a true value was explicitly supplied as
the autoreload option to the apache.import_module() function they will only be reloaded if the
PythonAutoReload directive is On. The default value when the directive is not specified will be On, so
the directive need only be used when wishing to set it to Off to disable automatic reloading, such as in a
production system.
Where possible, the PythonAutoReload directive should only be specified in one place and in the root
context for a specific Python interpreter instance. If the PythonAutoReload directive is used in multiple
places with different values, or doesn’t cover all directories pertaining to a specific Python interpreter instance,
then problems can result. This is because requests against some URLs may result in modules being reloaded
whereas others may not, even when through each URL the same module may be imported from a common
location.
If absolute certainty is required that module reloading is disabled and that it isn’t being enabled through some
subset of URLs, the PythonImport directive should be used to import a special module whenever an Apache
child process is being created. This module should include a call to the apache.freeze_modules()
function. This will have the effect of permanently disabling module reloading for the complete life of that
Apache child process, irrespective of what value the PythonAutoReload directive is set to.
Using the new ability within mod_python 3.3 to have PythonImport call a specific function within a module
after it has been imported, one could actually dispense with creating a module and instead call the function
directly out of the mod_python.apache module. For example:
Where module reloading is being undertaken, modules are not reloaded on top of the existing module instance, as was
the case with the core module importer in versions of mod_python prior to 3.3, but into a completely new module
instance. This means that any code that previously relied on state information or data caches being preserved across
reloads will no longer work.
If it is necessary to transfer such information from an old module to the new module, a
hook function must be provided within the module to transfer across the data that must be preserved. The name of this
hook function is __mp_clone__(). The argument given to the hook function will be an empty module into which the new
module will subsequently be loaded.
When called, the hook function should copy any data from the old module to the new module. In doing this,
the code performing the copying should be cognizant of the fact that within a multithreaded Apache MPM
other threads may still be using the data being copied, so any lock protecting that data should be held while
it is transferred. For example:
import threading

if '_lock' not in globals():
    # Initial import of this module.
    _lock = threading.Lock()
    _data1 = {'value1': 0, 'value2': 0}
    _data2 = {}

def __mp_clone__(module):
    # Hold the lock so the data cannot change while it is handed over.
    _lock.acquire()
    try:
        module._lock = _lock
        module._data1 = _data1
        module._data2 = _data2
    finally:
        _lock.release()
Because the old module is about to be discarded, the data which is transferred should not consist of data objects
which are dependent on code within the old module. Data being copied across to the new module should consist
of standard Python data types, or be instances of classes contained within modules which themselves are not
candidates for reloading. Otherwise, data should be migrated by transforming it into some neutral intermediate
state, with the new module transforming it back when its code executes at the time of being imported.
If these guidelines aren’t heeded and data is dependent on code objects within the old module, it will prevent
those code objects from being unloaded and if this continues across multiple reloads, then process size may
increase over time due to old code objects being retained.
In any case, if for some reason the hook function fails and an exception is raised then both the old and new
modules will be discarded. As a last opportunity to release any resources when this occurs, an extra hook
function called __mp_purge__() can be supplied. This function will be called with no arguments.
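The interplay between the clone hook, the purge hook and the fresh module instance can be simulated without Apache. The following is only a sketch under stated assumptions: the load() helper and MODULE_SOURCE are hypothetical stand-ins for the importer and a code file, and the real importer's bookkeeping is not modelled.

```python
import types

# Stand-in for a code file on disk (hypothetical content).
MODULE_SOURCE = '''
import threading

if '_lock' not in globals():
    # Initial import: no state was handed over, so create it.
    _lock = threading.Lock()
    _data = {'hits': 0}

def __mp_clone__(module):
    # Hand the live objects to the replacement module under the lock.
    with _lock:
        module._lock = _lock
        module._data = _data
'''


def load(name, source, old=None):
    """Execute source in a brand new module, cloning state from old first."""
    new = types.ModuleType(name)
    clone = getattr(old, '__mp_clone__', None)
    if clone is not None:
        try:
            clone(new)
        except Exception:
            purge = getattr(old, '__mp_purge__', None)
            if purge is not None:
                purge()  # last chance to release resources
            raise
    exec(source, new.__dict__)
    return new


first = load('counter', MODULE_SOURCE)
first._data['hits'] = 5
second = load('counter', MODULE_SOURCE, old=first)
```

In this simulation second is a different module object from first, yet second._data is the same dictionary, because the hook placed it in the empty module before the module code ran and the globals() check then skipped re-initialisation.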
allow_methods([*args])
A convenience function to set values in req.allowed. req.allowed is a bitmask that is used to construct
the ‘Allow:’ header. It should be set before returning a HTTP_NOT_IMPLEMENTED error.
Arguments can be one or more of the following:
Example:
if apache.mpm_query(apache.AP_MPMQ_IS_THREADED):
    pass  # do something
else:
    pass  # do something else
4.5.2 Attributes
interpreter
The name of the subinterpreter under which we’re running. (Read-Only)
main_server
A server object for the main server. (Read-Only)
class table([mapping-or-sequence])
Returns a new empty object of type mp_table. See Section 4.5.3 for description of the table object. The
mapping-or-sequence will be used to provide initial values for the table.
The table object is a wrapper around the Apache APR table. The table object behaves very much like a dictionary
(including the Python 2.2 features such as support of the in operator, etc.), with the following differences:
Much of the information that Apache uses is stored in tables. For example, req.headers_in and
req.headers_out.
All the tables that mod python provides inside the request object are actual mappings to the Apache structures,
so changing the Python table also changes the underlying Apache table.
In addition to normal dictionary-like behavior, the table object also has the following method:
add(key, val)
add() allows for creating duplicate keys, which is useful when multiple headers, such as Set-Cookie:
are required.
New in version 3.0.
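To make the difference between item assignment and add() concrete, here is a pure-Python model of a table that tolerates duplicate keys. It is a sketch only: TableSketch is a hypothetical class, and the case-insensitive lookup and the list returned for repeated keys are assumptions of this model rather than statements taken from the text above.

```python
class TableSketch(object):
    """Dictionary-like container that allows duplicate keys via add()."""

    def __init__(self):
        self._items = []  # ordered (key, value) pairs

    def __setitem__(self, key, value):
        # Assignment replaces every existing entry for the key.
        self._items = [(k, v) for k, v in self._items
                       if k.lower() != key.lower()]
        self._items.append((key, value))

    def add(self, key, value):
        # add() appends without removing earlier entries.
        self._items.append((key, value))

    def __getitem__(self, key):
        values = [v for k, v in self._items if k.lower() == key.lower()]
        if not values:
            raise KeyError(key)
        # Assumption: a single value is returned bare, several as a list.
        return values[0] if len(values) == 1 else values

    def __contains__(self, key):
        return any(k.lower() == key.lower() for k, _ in self._items)


headers = TableSketch()
headers['Content-Type'] = 'text/html'
headers.add('Set-Cookie', 'a=1')
headers.add('Set-Cookie', 'b=2')
```

After these calls the model holds two Set-Cookie entries, which is the situation add() exists to support when a response needs several headers of the same name.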
The request object is a Python mapping to the Apache request_rec structure. When a handler is invoked, it is
always passed a single argument - the request object.
You can dynamically assign attributes to it as a way to communicate between handlers.
Request Methods
if manager:
    req.add_handler("PythonHandler", "menu::admin")
else:
    req.add_handler("PythonHandler", "menu::basic")
Note: If you pass this function an invalid handler, an exception will be generated at the time an attempt is made
to find the handler.
add_input_filter(filter_name)
Adds the named filter into the input filter chain for the current request. The filter should be added before the
first attempt to read any data from the request.
req.add_output_filter("CONTENT_LENGTH")
req.write("content",0)
•apache.REMOTE_HOST Look up the DNS name. Return None if Apache directive
HostNameLookups is off or the hostname cannot be determined.
•apache.REMOTE_NAME (Default) Return the DNS name if possible, or the IP (as a string in dotted
decimal notation) otherwise.
•apache.REMOTE_NOLOOKUP Don’t perform a DNS lookup, return an IP. Note: if a lookup was performed
prior to this call, then the cached host name is returned.
If str_is_ip is None or unspecified, then the return value is a string representing the DNS name or IP address.
If the optional str_is_ip argument is not None, then the return value is an (address, str_is_ip) tuple,
where str_is_ip is non-zero if address is an IP address string.
On failure, None is returned.
get_options()
Returns a reference to the table object containing the options set by the PythonOption directives.
internal_redirect(new_uri)
Internally redirects the request to the new_uri. new_uri must be a string.
The httpd server handles internal redirection by creating a new request object and processing all request phases.
Within an internal redirect, req.prev will contain a reference to a request object from which it was redirected.
is_https()
Returns non-zero if the connection is using SSL/TLS. Will always return zero if the mod_ssl Apache module is
not loaded.
You can use this method during any request phase, unlike looking for the HTTPS variable in the
subprocess_env member dictionary. This makes it possible to write an authentication or access handler
that makes decisions based upon whether SSL is being used.
Note that this method will not determine the quality of the encryption being used. For that you should call the
ssl_var_lookup method to get one of the SSL_CIPHER* variables.
log_error(message[, level])
An interface to the Apache ap_log_rerror function. message is a string with the error message, level is one
of the following flag constants:
APLOG_EMERG
APLOG_ALERT
APLOG_CRIT
APLOG_ERR
APLOG_WARNING
APLOG_NOTICE
APLOG_INFO
APLOG_DEBUG
APLOG_NOERRNO
If you need to write to log and do not have a reference to a request object, use the apache.log_error
function.
meets_conditions()
Calls the Apache ap_meets_conditions() function which returns a status code. If status is apache.OK,
generate the content of the response normally. If not, simply return status. Note that mtime (and possibly the
ETag header) should be set as appropriate prior to calling this function. The same goes for req.status if the
status differs from apache.OK.
Example:
status = r.meets_conditions()
if status != apache.OK:
    return status
requires()
Returns a tuple of strings of arguments to require directive.
For example, with the following apache configuration:
AuthType Basic
require user joe
require valid-user
SSL_CIPHER
SSL_CLIENT_CERT
SSL_CLIENT_VERIFY
SSL_PROTOCOL
SSL_SESSION_ID
Note: Not all SSL variables are defined or have useful values in every request phase. Also use caution when
relying on these values for security purposes, as SSL or TLS protocol parameters can often be renegotiated at
any time during a request.
update_mtime(dependency_mtime)
If dependency_mtime is later than the value in the mtime attribute, sets the attribute to the new value.
write(string[, flush=1])
Writes string directly to the client, then flushes the buffer, unless flush is 0.
flush()
Flushes the output buffer.
Request Members
connection
A connection object associated with this request. See Connection Object below for details. (Read-Only)
server
A server object associated with this request. See Server Object below for details. (Read-Only)
next
If this is an internal redirect, the request object we redirect to. (Read-Only)
prev
If this is an internal redirect, the request object we redirect from. (Read-Only)
main
If this is a sub-request, pointer to the main request. (Read-Only)
the_request
String containing the first line of the request. (Read-Only)
assbackwards
Indicates an HTTP/0.9 “simple” request. This means that the response will contain no headers, only the body.
Although this exists for backwards compatibility with obsolescent browsers, some people have figured out that
setting assbackwards to 1 can be a useful technique when including part of the response from an internal redirect
to avoid headers being sent.
proxyreq
A proxy request: one of apache.PROXYREQ_* values.
header_only
A boolean value indicating HEAD request, as opposed to GET. (Read-Only)
protocol
Protocol, as given by the client, or ‘HTTP/0.9’. Same as CGI SERVER_PROTOCOL. (Read-Only)
proto_num
Integer. Numeric version of the protocol; 1.1 = 1001. (Read-Only)
hostname
String. Host, as set by full URI or Host: header. (Read-Only)
request_time
A long integer. When request started. (Read-Only)
status_line
Status line. E.g. ‘200 OK’. (Read-Only)
status
Status. One of apache.HTTP_* values.
method
A string containing the method - ’GET’, ’HEAD’, ’POST’, etc. Same as CGI REQUEST_METHOD. (Read-Only)
method_number
Integer containing the method number. (Read-Only)
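import os
from mod_python import apache

def typehandler(req):
    # Route requests for .py files to the mod_python publisher handler.
    if os.path.splitext(req.filename)[1] == ".py":
        req.handler = "mod_python"
        req.add_handler("PythonHandler", "mod_python.publisher")
        return apache.OK
    return apache.DECLINED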
content_encoding
String. Content encoding. (Read-Only)
vlist_validator
Integer. Variant list validator (if negotiated). (Read-Only)
user
If an authentication check is made, this will hold the user name. Same as CGI REMOTE_USER.
Note: req.get_basic_auth_pw() must be called prior to using this value.
ap_auth_type
Authentication type. Same as CGI AUTH_TYPE.
no_cache
Boolean. This response cannot be cached.
no_local_copy
Boolean. No local copy exists.
unparsed_uri
The URI without any parsing performed. (Read-Only)
uri
The path portion of the URI.
filename
String. File name being requested.
canonical_filename
String. The true filename (req.filename is canonicalized if they don’t match).
path_info
For backward compatibility, the object can also be accessed as if it were a tuple. The apache module defines
a set of FINFO_* constants that should be used to access elements of this tuple.
user = req.finfo[apache.FINFO_USER]
parsed_uri
Tuple. The URI broken down into pieces. (scheme, hostinfo, user, password, hostname,
port, path, query, fragment). The apache module defines a set of URI_* constants that should
be used to access elements of this tuple. Example:
fname = req.parsed_uri[apache.URI_PATH]
(Read-Only)
used_path_info
Flag to accept or reject path_info on current request.
eos_sent
Boolean. EOS bucket sent. (Read-Only)
The connection object is a Python mapping to the Apache conn_rec structure.
Connection Methods
If you need to write to log and do not have a reference to a connection or request object, use the
apache.log_error function.
read([length ])
Reads at most length bytes from the client. The read blocks indefinitely until there is at least one byte to read. If
length is -1, keeps reading until the socket is closed from the other end (this is known as EXHAUSTIVE mode
in the http server code).
This method should only be used inside Connection Handlers.
Note: The behaviour of this method has changed since version 3.0.3. In 3.0.3 and prior, this method would
block until length bytes were read.
readline([length ])
Reads a line from the connection or up to length bytes.
This method should only be used inside Connection Handlers.
write(string)
Writes string to the client.
This method should only be used inside Connection Handlers.
Connection Members
base_server
A server object for the physical vhost that this connection came in through. (Read-Only)
local_addr
The (address, port) tuple for the server. (Read-Only)
remote_addr
The (address, port) tuple for the client. (Read-Only)
remote_ip
String with the IP of the client. Same as CGI REMOTE_ADDR. (Read-Only)
remote_host
String. The DNS name of the remote client. None if DNS has not been checked, "" (empty string) if no name
found. Same as CGI REMOTE_HOST. (Read-Only)
remote_logname
Remote name if using RFC1413 (ident). Same as CGI REMOTE_IDENT. (Read-Only)
aborted
Boolean. True if the connection is aborted. (Read-Only)
keepalive
Integer. 1 means the connection will be kept for the next request, 0 means “undecided”, -1 means “fatal error”.
(Read-Only)
A filter object is passed to mod_python input and output filters. It is used to obtain filter information, as well as get
and pass information to adjacent filters in the filter stack.
Filter Methods
pass_on()
Passes all data through the filter without any processing.
read([length])
Reads at most length bytes from the next filter, returning a string with the data read or None if End Of Stream
(EOS) has been reached. A filter must be closed once the EOS has been encountered.
If the length argument is negative or omitted, reads all data currently available.
readline([length ])
Reads a line from the next filter or up to length bytes.
write(string)
Writes string to the next filter.
flush()
Flushes the output by sending a FLUSH bucket.
close()
Closes the filter and sends an EOS bucket. Any further IO operations on this filter will throw an exception.
disable()
Tells mod_python to ignore the provided handler and just pass the data on. Used internally by mod_python to
print traceback from exceptions encountered in filter handlers to avoid an infinite loop.
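The read/write/close contract described above can be exercised with a stand-in object. In the sketch below, FakeFilter is a hypothetical test double that mimics only the three methods used; the upper-casing handler shows the usual read loop, closing the filter once read() reports End Of Stream by returning None.

```python
class FakeFilter(object):
    """Minimal stand-in for a filter object (assumption, not mod_python)."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.output = []
        self.closed = False

    def read(self, length=-1):
        # Return the next chunk, or None once the stream is exhausted.
        if self._chunks:
            return self._chunks.pop(0)
        return None

    def write(self, string):
        if self.closed:
            raise IOError('filter is closed')
        self.output.append(string)

    def close(self):
        self.closed = True


def outputfilter(flt):
    """Upper-case everything passing through the filter."""
    data = flt.read()
    while data:
        flt.write(data.upper())
        data = flt.read()
    if data is None:
        # EOS was reached, so the filter must be closed.
        flt.close()


flt = FakeFilter(['hello ', 'world'])
outputfilter(flt)
```

The handler distinguishes an empty string (no data available right now) from None (End Of Stream) and closes only in the latter case.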
Filter Members
closed
A boolean value indicating whether a filter is closed. (Read-Only)
name
String. The name under which this filter is registered. (Read-Only)
The server object is a Python mapping to the Apache server_rec structure. The server structure describes the
server (possibly virtual server) serving the request.
Server Methods
get_config()
Similar to req.get_config(), but returns a table object holding only the mod_python configuration defined
at global scope within the Apache configuration. That is, outside of the context of any VirtualHost, Location,
Directory or Files directives.
get_options()
Similar to req.get_options(), but returns a table object holding only the mod_python options defined
at global scope within the Apache configuration. That is, outside of the context of any VirtualHost, Location,
Directory or Files directives.
log_error(message[, level])
An interface to the Apache ap_log_error function. message is a string with the error message, level is one
of the following flag constants:
APLOG_EMERG
APLOG_ALERT
APLOG_CRIT
APLOG_ERR
APLOG_WARNING
APLOG_NOTICE
APLOG_INFO
APLOG_DEBUG
APLOG_NOERRNO
If you need to write to log and do not have a reference to a server or request object, use the
apache.log_error function.
register_cleanup(request, callable[, data])
Registers a cleanup. Very similar to req.register_cleanup(), except this cleanup will be executed at
child termination time. This function requires the request object be supplied to infer the interpreter name. If
you don’t have any request object at hand, then you must use the apache.register_cleanup variant.
Warning: do not pass directly or indirectly a request object in the data parameter. Since the callable will be
called at server shutdown time, the request object won’t exist anymore and any manipulation of it in the callable
will give undefined behaviour.
Server Members
defn_name
String. The name of the configuration file where the server definition was found. (Read-Only)
See Also:
Common Gateway Interface RFC Project Page
(http://CGI-Spec.Golux.Com/)
for detailed information on the CGI specification
Access to form data is provided via the FieldStorage class. This class is similar to the FieldStorage class of
the standard library cgi module.
class FieldStorage(req[, keep_blank_values, strict_parsing, file_callback, field_callback])
This class provides uniform access to HTML form data submitted by the client. req is an instance of the
mod_python request object.
The optional argument keep_blank_values is a flag indicating whether blank values in URL encoded form data
should be treated as blank strings. The default is false, which means that blank values are ignored as if they
were not included.
The optional argument strict_parsing is not yet implemented.
The optional argument file_callback allows the application to override both file creation/deletion semantics and
location. See 4.6.2 “FieldStorage Examples” for additional information. New in version 3.2
The optional argument field_callback allows the application to override both the creation/deletion semantics
and behaviour. New in version 3.2
During initialization, the FieldStorage class reads all of the data provided by the client. Since all data provided
by the client is consumed at this point, there should be no more than one FieldStorage class instantiated
per single request, nor should you make any attempts to read client data before or after instantiating a
FieldStorage. A suggested strategy for dealing with this is that any handler should first check for the existence
of a form attribute within the request object. If this exists, it should be taken to be an existing instance
of the FieldStorage class and that should be used. If the attribute does not exist and needs to be created, it
should be cached as the form attribute of the request object so later handler code can use it.
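The suggested caching strategy amounts to a few lines of code. The sketch below is hedged: get_form is a hypothetical helper, and FakeRequest and CountingStorage are stand-ins so the snippet runs without Apache. In a real handler the factory argument would be the FieldStorage class from mod_python.util.

```python
def get_form(req, factory):
    """Return the form cached on req, creating it at most once."""
    form = getattr(req, 'form', None)
    if form is None:
        # First use in this request: parse the client data and cache it.
        form = factory(req)
        req.form = form
    return form


class FakeRequest(object):
    """Stand-in for the request object (assumption for illustration)."""


class CountingStorage(object):
    """Stand-in for FieldStorage that counts how often it is built."""
    created = 0

    def __init__(self, req):
        CountingStorage.created += 1


req = FakeRequest()
first = get_form(req, CountingStorage)
second = get_form(req, CountingStorage)
```

Both calls return the same object and the stand-in is constructed once, so the client data would be read only a single time no matter how many handlers ask for the form.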
When the FieldStorage class instance is created, the data read from the client is then parsed into separate
fields and packaged in Field objects, one per field. For HTML form inputs of type file, a temporary file is
created that can later be accessed via the file attribute of a Field object.
The FieldStorage class has a mapping object interface, i.e. it can be treated like a dictionary in most
instances, but is not strictly compatible as it is missing some methods provided by dictionaries and some methods
don’t behave entirely like their counterparts, especially when there is more than one value associated with a form
field. When used as a mapping, the keys are form input names, and the returned dictionary value can be:
•An instance of StringField, containing the form input value. This is only when there is a single value
corresponding to the input name. StringField is a subclass of str which provides the additional
value attribute for compatibility with standard library cgi module.
•An instance of a Field class, if the input is a file upload.
•A list of StringField and/or Field objects. This is when multiple values exist, such as for a
<select> HTML form element.
Note: Unlike the standard library cgi module FieldStorage class, a Field object is returned only when
it is a file upload. In all other cases the return is an instance of StringField. This means that you do not
need to use the .value attribute to access values of fields in most cases.
The following examples demonstrate how to use the file_callback parameter of the FieldStorage constructor to
control file object creation. The Storage classes created in both examples derive from FileType, thereby providing
extended file functionality.
These examples are provided for demonstration purposes only. The issue of temporary file location and security must
be considered when providing such overrides with mod_python in production use.
Simple file control using class constructor This example uses the FieldStorage class constructor to create the
file object, allowing simple control. It is not advisable to add class variables to this if serving multiple sites from
Apache. In that case use the factory method instead.
    def close(self):
        if self.already_deleted:
            return
        super(Storage, self).close()
        if self.delete_on_close:
            self.already_deleted = True
            os.remove(self.real_filename)
Advanced file control using object factory Using an object factory can provide greater control over the constructor
parameters.
import os
class Storage(file):
    def close(self):
        if self.already_deleted:
            return
        super(Storage, self).close()
        if self.delete_on_close:
            self.already_deleted = True
            os.remove(self.real_filename)
class StorageFactory:
file_factory = StorageFactory(someDirectory)
[...sometime later...]
request_data = util.FieldStorage(request, keep_blank_values=True,
file_callback=file_factory.create)
4.7.1 Classes
class Cookie(name, value[, attributes ])
This class is used to construct a single cookie named name and having value as the value. Additionally, any of
the attributes defined in the Netscape specification and RFC2109 can be supplied as keyword arguments.
The attributes of the class represent cookie attributes, and their string representations become part of the string
representation of the cookie. The Cookie class restricts attribute names to only valid values, specifically,
Note: Because this method uses a dictionary, it is not possible to have duplicate cookies. If you would like
to have more than one value in a single cookie, consider using a MarshalCookie.
class SignedCookie(name, value, secret[, attributes ])
This is a subclass of Cookie. This class creates cookies whose name and value are automatically signed using
HMAC (md5) with a provided secret secret, which must be a non-empty string.
parse(string, secret)
This method acts the same way as Cookie.parse(), but also verifies that the cookie is correctly signed.
If the signature cannot be verified, the object returned will be of class Cookie.
Note: Always check the types of objects returned by SignedCookie.parse(). If it is an instance of
Cookie (as opposed to SignedCookie), the signature verification has failed:
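The idea behind signing and verifying can be shown with the standard hmac module. This sketch is not the SignedCookie implementation and does not reproduce its wire format: the sign and verify helpers, and the choice to prepend the hex digest to the value, are assumptions made for illustration. A failed check is reported as None here, which plays the role of the type check described above.

```python
import hashlib
import hmac


def _digest(name, value, secret):
    # HMAC-MD5 over the cookie name and value, keyed with the secret.
    mac = hmac.new(secret.encode('utf-8'),
                   (name + value).encode('utf-8'), hashlib.md5)
    return mac.hexdigest()


def sign(name, value, secret):
    """Return the value prefixed with its 32 character hex signature."""
    return _digest(name, value, secret) + value


def verify(name, signed_value, secret):
    """Return the original value, or None if the signature does not match."""
    signature, value = signed_value[:32], signed_value[32:]
    expected = _digest(name, value, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(signature.encode('utf-8'),
                           expected.encode('utf-8')):
        return value
    return None


token = sign('sid', 'abc', 'secret007')
```

A caller should treat a None result exactly like an unsigned cookie: the value cannot be trusted and should be discarded.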
4.7.2 Functions
add_cookie(req, cookie[, value, attributes])
This is a convenience function for setting a cookie in request headers. req is a mod_python Request object. If
cookie is an instance of Cookie (or subclass thereof), then the cookie is set, otherwise, cookie must be a string,
Here is another:
4.7.3 Examples
def handler(req):
This example checks for incoming marshal cookie and displays it to the client. If no incoming cookie is present a new
marshal cookie is set. This example uses ‘secret007’ as the secret for HMAC signature.
def handler(req):
else:
return apache.OK
4.8.1 Classes
Session(req[, sid, secret, timeout, lock])
Session() takes the same arguments as BaseSession.
This function returns an instance of the default session class. The session class to be used can be
specified using PythonOption mod_python.session.session_type value, where value is one of DbmSession,
MemorySession or FileSession. Specifying custom session classes using PythonOption session
is not yet supported.
If the session_type option is not found, the function queries the MPM and based on that returns either a new instance
of DbmSession or MemorySession. MemorySession will be used if the MPM is threaded and not
id()
Returns the session id.
created()
Returns the session creation time in seconds since beginning of epoch.
last_accessed()
Returns last access time in seconds since beginning of epoch.
timeout()
Returns session timeout interval in seconds.
set_timeout(secs)
Set timeout to secs.
invalidate()
This method will remove the session from the persistent store and also place a header in outgoing headers
to invalidate the session id cookie.
load()
Load the session values from storage.
save()
This method writes session values to storage.
delete()
Remove the session from storage.
init_lock()
This method initializes the session lock. There is no need to ever call this method; it is intended for
subclasses that wish to use an alternative locking mechanism.
lock()
Locks this session. If the session is already locked by another thread/process, wait until that lock is
released. There is no need to call this method if locking is handled automatically (default).
This method registers a cleanup which always unlocks the session at the end of the request processing.
unlock()
Unlocks this session. (Same as lock() - when locking is handled automatically (default), there is no
need to call this method).
cleanup()
This method is for subclasses to implement a session storage cleaning mechanism (i.e. deleting expired
sessions, etc.). It is called at random; the chance of it being called is controlled by the CLEANUP_CHANCE
Session module variable (default 1000), which means there is a 1 in 1000 chance of a cleanup being ordered
on any given call. Subclasses implementing this method should not perform the (potentially
time consuming) cleanup operation in this method, but should instead use req.register_cleanup
to register a cleanup which will be executed after the request has been processed.
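The random scheduling described for cleanup() can be sketched as follows. This is only an illustration of the 1-in-N idea, assuming a uniform random draw; maybe_schedule_cleanup is a hypothetical helper, and the register_cleanup argument stands in for req.register_cleanup so the sketch runs outside Apache.

```python
import random

CLEANUP_CHANCE = 1000  # mirrors the documented default


def maybe_schedule_cleanup(register_cleanup, do_cleanup,
                           chance=CLEANUP_CHANCE, rng=random):
    """With probability 1/chance, defer do_cleanup via register_cleanup."""
    if rng.randint(1, chance) == 1:
        # Defer the expensive work; do not run it inline.
        register_cleanup(do_cleanup)
        return True
    return False


# A plain list stands in for the request's cleanup registry.
deferred = []
maybe_schedule_cleanup(deferred.append, lambda: None, chance=1)
print(len(deferred))  # 1, because a chance of 1 always schedules
```

Deferring the work keeps the client from waiting on a scan of the session store; the response has already been sent by the time the registered cleanup runs.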
class DbmSession(req[, dbm, sid, secret, dbmtype, timeout, lock ])
This class provides session storage using a dbm file. Generally, dbm access is very fast, and most dbm
implementations memory-map files for faster access, which makes their performance nearly as fast as direct shared
memory access.
dbm is the name of the dbm file (the file must be writable by the httpd process). This file is not deleted
when the server process is stopped (a nice side benefit of this is that sessions can survive server restarts).
•fast_cleanup A boolean value used to turn on FileSession cleanup optimization. Default is True and will
result in reduced cleanup time when there are a large number of session files.
When fast_cleanup is True, the modification time for the session file is used to determine if it is a
candidate for deletion. If (current_time - file_modification_time) > (timeout +
grace_period), the file will be a candidate for deletion. If verify_cleanup is False, no further checks
will be made and the file will be deleted.
If fast_cleanup is False, the session file will be unpickled and its timeout value used to determine if the
session is a candidate for deletion. fast_cleanup = False implies verify_cleanup = True.
The timeout used in the fast_cleanup calculation is the same as the timeout for the session in the current
request running the FileSession cleanup. If your session objects are not using the same timeout, or you
are manually setting the timeout for a particular session with set_timeout(), you will need to set
verify_cleanup = True.
The value of fast_cleanup can also be set using PythonOption
mod_python.file_session.enable_fast_cleanup.
•verify_cleanup Boolean value used to optimize the FileSession cleanup process. Default is True.
If verify_cleanup is True, the session file which is being considered for deletion will be unpickled and its
timeout value will be used to decide if the file should be deleted.
When verify_cleanup is False, the timeout value for the current session will be used to determine if the
session has expired. In this case, the session data will not be read from disk, which can lead to a substantial
performance improvement when there are a large number of session files, or where each session is saving
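The interaction between fast_cleanup and verify_cleanup can be summarised as a small decision function. This is a sketch of the rules as stated above, not the FileSession source: should_delete and load_session are hypothetical names, the stored session is modelled as a dictionary with "accessed" and "timeout" entries, and applying the grace period in the verified check is an assumption.

```python
def should_delete(now, file_mtime, current_timeout, grace_period,
                  load_session, fast_cleanup=True, verify_cleanup=True):
    """Decide whether a session file should be removed during cleanup."""
    if fast_cleanup:
        # Cheap test first: uses only the file modification time.
        candidate = (now - file_mtime) > (current_timeout + grace_period)
        if not candidate:
            return False
        if not verify_cleanup:
            return True  # no further checks, the file is never read
    # Verification path: read the stored session and use its own timeout.
    # (fast_cleanup=False always ends up here.)
    stored = load_session()
    return (now - stored["accessed"]) > (stored["timeout"] + grace_period)


def long_lived():
    # Stand-in for unpickling a session that set a long timeout for itself.
    return {"accessed": 0, "timeout": 5000}


print(should_delete(1000, 0, 100, 10, long_lived, verify_cleanup=False))  # True
print(should_delete(1000, 0, 100, 10, long_lived, verify_cleanup=True))   # False
```

The two results differ for the same file, which is why sessions with individually adjusted timeouts need verify_cleanup = True.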
4.8.2 Examples
from mod_python import apache, Session

def handler(req):
    session = Session.Session(req)
    try:
        session['hits'] += 1
    except KeyError:
        # first request in this session: no counter stored yet
        session['hits'] = 1
    session.save()
    req.content_type = 'text/plain'
    req.write('Hits: %d\n' % session['hits'])
    return apache.OK
<html>
<%
import time
%>
Hello world, the time is: <%=time.strftime("%Y-%m-%d, %H:%M:%S")%>
</html>
Internally, the PSP parser would translate the above page into the following Python code:
req.write("""<html>
""")
import time
req.write("""
Hello world, the time is: """); req.write(str(time.strftime("%Y-%m-%d, %H:%M:%S"))); req.write("""
</html>
""")
This code, when executed inside a handler would result in a page displaying words ‘Hello world, the time
is: ’ followed by current time.
Python code can be used to output parts of the page conditionally or in loops. Blocks are denoted from within Python
code by indentation. The last indentation in Python code (even if it is a comment) will persist through the document
until either end of document or more Python code.
Here is an example:
<html>
<%
for n in range(3):
    # This indent will persist
%>
<p>This paragraph will be
repeated 3 times.</p>
<%
# This line will cause the block to end
%>
This line will only be shown once.<br>
</html>
The above will be internally translated to the following Python code:
req.write("""<html>
""")
for n in range(3):
    # This indent will persist
    req.write("""
<p>This paragraph will be
repeated 3 times.</p>
""")
# This line will cause the block to end
req.write("""
This line will only be shown once.<br>
</html>
""")
The parser is also smart enough to figure out the indent if the last line of Python ends with ‘:’ (colon). Considering
this, and that the indent is reset when a newline is encountered inside ‘<%%>’, the above page can be written as:
<html>
<%
for n in range(3):
%>
<p>This paragraph will be
repeated 3 times.</p>
<%
%>
This line will only be shown once.<br>
</html>
However, the above code can be confusing, thus having descriptive comments denoting blocks is highly recommended
as a good practice.
The only directive supported at this time is include, here is how it can be used:
<%@ include file="/file/to/include"%>
If the parse() function was called with the dir argument, then the file can be specified as a relative path, otherwise
it has to be absolute.
class PSP(req[, filename, string, vars ])
This class represents a PSP object.
req is a request object; filename and string are optional keyword arguments which indicate the source of the PSP
code. Only one of these can be specified. If neither is specified, req.filename is used as filename.
vars is a dictionary of global variables. Vars passed in the run() method will override vars passed in here.
This class is used internally by the PSP handler, but can also be used as a general purpose templating tool.
When a file is used as the source, the code object resulting from the specified file is stored in a memory cache
keyed on file name and file modification time. The cache is global to the Python interpreter. Therefore, unless
the file modification time changes, the file is parsed and resulting code is compiled only once per interpreter.
The cache is limited to 512 pages, which depending on the size of the pages could potentially occupy a significant
amount of memory. If memory is of concern, then you can switch to dbm file caching. Our simple tests showed
only 20% slower performance using bsd db. You will need to check which implementation anydbm defaults
Note that the dbm cache file is not deleted when the server restarts.
Unlike with files, the code objects resulting from a string are cached in memory only. There is no option to
cache in a dbm file at this time.
Note that the above name for the option setting was only changed to this value in mod_python 3.3. If you need
to retain backward compatibility with older versions of mod_python use the PSPDbmCache option instead.
run([vars, flush ])
This method will execute the code (produced at object initialization time by parsing and compiling the PSP
source). Optional argument vars is a dictionary keyed by strings that will be passed in as global variables.
Optional argument flush is a boolean flag indicating whether output should be flushed. The default is not
to flush output.
Additionally, the PSP code will be given global variables req, psp, session and form. A session will
be created and assigned to session variable only if session is referenced in the code (the PSP handler
examines co_names of the code object to make that determination). Remember that a mere mention of
session will generate cookies and turn on session locking, which may or may not be what you want.
Similarly, a mod_python FieldStorage object will be instantiated if form is referenced in the code.
The object passed in psp is an instance of PSPInterface.
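The co_names check mentioned above is easy to reproduce in isolation. The sketch below assumes nothing about the PSP handler beyond what the text states; referenced_globals is a hypothetical helper that compiles a snippet and reports which of the optional globals its top-level code refers to.

```python
def referenced_globals(source, candidates=("session", "form")):
    """List which candidate names appear in the compiled code's co_names."""
    code = compile(source, "<psp>", "exec")
    return [name for name in candidates if name in code.co_names]


print(referenced_globals("req.write(str(session['hits']))"))  # ['session']
print(referenced_globals("name = form['name']; session.save()"))  # ['session', 'form']
print(referenced_globals("req.write('no optional globals here')"))  # []
```

Since the test is purely name based, any reference to session in executable code is enough to trigger session creation, whether or not that code path is ever taken.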
display_code()
Returns an HTML-formatted string representing a side-by-side listing of the original PSP code and result-
ing Python code produced by the PSP parser.
Here is an example of how PSP can be used as a templating mechanism:
The template file:
<html>
<!-- This is a simple psp template called template.html -->
<h1>Hello, <%=what%>!</h1>
</html>
The handler code:
from mod_python import apache, psp

def handler(req):
    template = psp.PSP(req, filename='template.html')
    template.run({'what':'world'})
    return apache.OK
class PSPInterface()
An object of this class is passed as a global variable psp to the PSP code. Objects of this class are instantiated
internally and the interface to __init__ is purposely undocumented.
set_error_page(filename)
Used to set a psp page to be processed when an exception occurs. If the path is absolute, it will be
appended to document root, otherwise the file is assumed to exist in the same directory as the current
page. The error page will receive one additional variable, exception, which is a 3-tuple returned by
sys.exc_info().
<%
# note that the '<' above is the first byte of the page!
psp.redirect('http://www.modpython.org')
%>
Additionally, the psp module provides the following low level functions:
parse(filename[, dir ])
This function will open file named filename, read and parse its content and return a string of resulting Python
code.
If dir is specified, then the ultimate filename to be parsed is constructed by concatenating dir and filename, and
the argument to include directive can be specified as a relative path. (Note that this is a simple concatenation,
no path separator will be inserted if dir does not end with one).
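The consequence of simple concatenation is easiest to see side by side. The helper below is hypothetical and only mimics the path construction described above; it does not call parse().

```python
def psp_source_path(filename, directory=None):
    """Mimic the described behaviour: plain string concatenation, no separator."""
    if directory is None:
        return filename
    return directory + filename


print(psp_source_path("page.psp", "/var/www"))   # /var/wwwpage.psp
print(psp_source_path("page.psp", "/var/www/"))  # /var/www/page.psp
```

Passing a dir value that already ends with a path separator avoids the first, unintended result.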
parsestring(string)
This function will parse contents of string and return a string of resulting Python code.
FIVE
PythonAuthzHandler mypackage.mymodule::checkallowed
For more information on handlers, see Overview of a Handler.
Side note: The ‘::’ was chosen for performance reasons. In order for Python to use objects inside modules, the
modules first need to be imported. Having the separator as simply a ‘.’ would considerably complicate the process of
sequentially evaluating every word to determine whether it is a package, module, class etc. Using the (admittedly
un-Python-like) ‘::’ spares mod_python the time consuming work of figuring out where the module part ends and the
object inside of it begins, resulting in a modest performance gain.
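With an explicit separator, splitting a handler specification needs no imports or lookups at all. The sketch below is an illustration rather than mod_python's parser: split_handler is a hypothetical name, and the default object name is passed in by the caller because it depends on the directive being processed.

```python
def split_handler(spec, default_object="handler"):
    """Split 'module::object' into its parts without importing anything."""
    module_name, separator, object_name = spec.partition("::")
    if not separator:
        # No '::' given, so fall back to the directive's default object name.
        object_name = default_object
    return module_name, object_name


print(split_handler("mypackage.mymodule::checkallowed"))
# ('mypackage.mymodule', 'checkallowed')
print(split_handler("mypackage.mymodule", default_object="authzhandler"))
# ('mypackage.mymodule', 'authzhandler')
```

Had ‘.’ been the separator, the same string would have required trial imports to discover whether checkallowed was a submodule or an object inside mymodule.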
5.1.2 PythonPostReadRequestHandler
5.1.3 PythonTransHandler
5.1.4 PythonHeaderParserHandler
5.1.5 PythonInitHandler
5.1.6 PythonAccessHandler
5.1.7 PythonAuthenHandler
from mod_python import apache

def authenhandler(req):
    # get_basic_auth_pw() must be called before req.user is read
    pw = req.get_basic_auth_pw()
    user = req.user
    if user == "spam" and pw == "eggs":
        return apache.OK
    else:
        return apache.HTTP_UNAUTHORIZED
Note: req.get_basic_auth_pw() must be called prior to using the req.user value. Apache makes no
attempt to decode the authentication information unless req.get_basic_auth_pw() is called.
5.1.8 PythonAuthzHandler
5.1.9 PythonTypeHandler
5.1.10 PythonFixupHandler
5.1.11 PythonHandler
5.1.12 PythonLogHandler
5.1.13 PythonCleanupHandler
5.2 Filters
5.2.1 PythonInputFilter
Context: server config
Module: mod_python.c
Registers an input filter handler under name name. Handler is a module name optionally followed by :: and a callable
object name. If callable object name is omitted, it will default to ‘inputfilter’. Name is the name under which
the filter is registered; by convention filter names are usually in all caps.
The module referred to by the handler can be a full module name (package dot notation is accepted) or an actual
path to a module code file. The module is loaded using the mod_python module importer as implemented by the
apache.import_module() function. Reference should be made to the documentation of that function for further
details of how module importing is managed.
To activate the filter, use the AddInputFilter directive.
5.2.2 PythonOutputFilter
5.3.1 PythonConnectionHandler
5.4.1 PythonEnablePdb
5.4.2 PythonDebug
5.4.3 PythonImport
5.4.4 PythonInterpPerDirectory
5.4.5 PythonInterpPerDirective
5.4.7 PythonHandlerModule
PythonAuthenHandler mymodule
PythonHandler mymodule
PythonLogHandler mymodule
PythonHandlerModule mymodule
5.4.8 PythonAutoReload
5.4.9 PythonOptimize
5.4.10 PythonOption
5.4.11 PythonPath
The path specified in this directive will replace the path, not add to it. However, because the value of the directive is
evaled, to append a directory to the path, one can specify something like
PythonPath "sys.path+['/mydir']"
Mod_python tries to minimize the number of evals associated with the PythonPath directive because evals are slow
and can negatively impact performance, especially when the directive is specified in an ‘.htaccess’ file which gets
parsed at every hit. Mod_python will remember the arguments to the PythonPath directive in the un-evaled form,
and before evaling the value it will compare it to the remembered value. If the value is the same, no action is taken.
Because of this, you should not rely on the directive as a way to restore the pythonpath to some value if your code
changes it.
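The effect of remembering the un-evaled directive can be demonstrated with a small model. This is a sketch of the behaviour described above, not mod_python code: apply_python_path is a hypothetical helper, and the path is kept in an ordinary dictionary instead of the real sys.path so the example has no side effects.

```python
from types import SimpleNamespace


def apply_python_path(directive, state):
    """Evaluate the directive only if its text differs from the remembered one."""
    if directive == state["remembered"]:
        return False  # same text as last time: no eval, path left untouched
    # Expose the current path to the expression as sys.path.
    namespace = {"sys": SimpleNamespace(path=list(state["path"]))}
    state["path"] = eval(directive, namespace)
    state["remembered"] = directive
    return True


state = {"remembered": None, "path": ["/usr/lib/python"]}
apply_python_path("sys.path+['/mydir']", state)  # evaluated
state["path"].remove("/mydir")                   # handler code alters the path
apply_python_path("sys.path+['/mydir']", state)  # skipped, nothing restored
print(state["path"])  # ['/usr/lib/python']
```

The second call leaves the altered path in place, which is exactly why the directive cannot be relied upon to undo changes made by application code.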
Note: This directive should not be used as a security measure since the Python path is easily manipulated from within
SIX
It is formatted like an HTML comment, so if you don’t have SSI correctly enabled, the browser will ignore it, but it
will still be visible in the HTML source. If you have SSI correctly configured, the directive will be replaced with its
results.
For a more thorough description of the SSI mechanism and how to enable it, see the SSI tutorial provided with the
Apache documentation.
Version 3.3 of mod_python introduces support for using Python code within SSI files. Note that mod_python honours
the intent of the Apache IncludesNOEXEC option to the Options directive. That is, if IncludesNOEXEC is
enabled, then Python code within a SSI file will not be executed.
Where the result of the expression is not a string, the value will be automatically converted to a string by applying
str() to the value.
In the case of 'exec' a block of Python code may be included. For any output from this code to appear in the page, it
must be written back explicitly as being part of the response. As SSI are processed by an Apache output filter, this is
done by using an instance of the mod_python filter object which is pushed into the global data set available to the
code.
<!--#python exec="
filter.write(10*'HELLO ')
filter.write(str(len('HELLO')))
" -->
Any Python code within the 'exec' block must have a zero first level indent. You cannot start the code block with an
arbitrary level of indent such that it lines up with any indenting used for surrounding HTML elements.
Although the mod_python filter object is not a true file object, the fact that it provides the write() method is sufficient
to allow the print statement to be used on it directly. This avoids the need to explicitly convert non string objects
to a string before being output.
<!--#python exec="
print >> filter, len('HELLO')
" -->
<!--#python exec="
import cgi, time, os
def _escape(object):
    return cgi.escape(str(object))
now = time.time()
" -->
<html>
<body>
<pre>
<!--#python eval="_escape(time.asctime(time.localtime(now)))"-->
<!--#python exec="
keys = os.environ.keys()
keys.sort()
for key in keys:
    print >> filter, _escape(key),
    print >> filter, '=',
    print >> filter, _escape(repr(os.environ.get(key)))
" -->
</pre>
</body>
</html>
The lifetime of any global data is for the current request only. If data must persist between requests, it must reside in
external modules and as necessary be protected against multithreaded access in the event that a multithreaded Apache
MPM is used.
PythonFixupHandler _handlers
The implementation of the fixup handler contained in _handlers.py then needs to create an instance of a Python
dictionary, store that in the mod_python request object as ssi_globals and then populate that dictionary with any
data to be available to the Python code executing within the page.
from mod_python import apache
import cgi, time

def _escape(object):
    return cgi.escape(str(object))

def _header(filter):
    print >> filter, '...'

def _footer(filter):
    print >> filter, '...'

def fixuphandler(req):
    # everything stored here becomes a global for code in the page
    req.ssi_globals = {}
    req.ssi_globals['time'] = time
    req.ssi_globals['_escape'] = _escape
    req.ssi_globals['_header'] = _header
    req.ssi_globals['_footer'] = _footer
    return apache.OK
This is most useful where it is necessary to insert common information such as headers, footers or menu panes which
are dynamically generated into many pages.
<!--#python exec="
now = time.time()
" -->
<html>
<body>
<!--#python exec="_header(filter)" -->
<pre>
<!--#python eval="_escape(time.asctime(time.localtime(now)))"-->
</pre>
<!--#python exec="_footer(filter)" -->
</body>
</html>
A test condition can be any sort of logical comparison, either comparing values to one another, or testing the ’truth’ of
a particular value.
The source of variables used in conditional expressions is distinct from the set of global data used by the Python
code executed within a page. Instead, the variables are sourced from the subprocess_env table object contained
within the request object. The values of these variables can be set from within a page using the SSI 'set' directive, or
by a range of other Apache directives within the Apache configuration files such as BrowserMatchNoCase and
SetEnvIf.
To set these variables from within a mod_python handler, the subprocess_env table object would be manipulated
directly through the request object.
def fixuphandler(req):
    debug = req.get_config().get('PythonDebug', '0')
    req.subprocess_env['DEBUG'] = debug
    return apache.OK
If being done from within Python code contained within the page itself, the request object would first have to be
accessed via the filter object.
<!--#python exec="
debug = filter.req.get_config().get('PythonDebug', '0')
filter.req.subprocess_env['DEBUG'] = debug
" -->
<html>
<body>
<!--#if expr="${DEBUG} != 0" -->
DEBUG ENABLED
<!--#else -->
DEBUG DISABLED
<!--#endif -->
</body>
</html>
When mod_python is being used, the ability to dynamically enable output filters for the current request can instead
be used. This could be done just for where the request maps to a static file, but may just as easily be carried out
where the content of a response is generated dynamically. In either case, to enable SSI for the current request, the
add_output_filter() method of the mod_python request object would be used.
def fixuphandler(req):
    req.add_output_filter('INCLUDES')
    return apache.OK
SEVEN
Standard Handlers
7.1.1 Introduction
To use the handler, you need the following lines in your configuration
<Directory /some/path>
SetHandler mod_python
PythonHandler mod_python.publisher
</Directory>
This handler allows access to functions and variables within a module via URLs. For example, if you have the
following module, called ‘hello.py’:
The Publisher handler maps a URI directly to a Python variable or callable object, then, respectively, returns its string
representation or calls it returning the string representation of the return value.
Traversal
The Publisher handler locates and imports the module specified in the URI. The module location is determined from
the req.filename attribute. Before importing, the file extension, if any, is discarded.
If req.filename is empty, the module name defaults to ‘index’.
Once the module is imported, the remaining part of the URI up to the beginning of any query data (a.k.a. PATH_INFO)
is used to find an object within the module. The Publisher handler traverses the path, one element at a time from left
to right, mapping the elements to Python objects within the module.
If no path info was given in the URL, the Publisher handler will use the default value of ‘index’. If the last element
is an object inside a module, and the one immediately preceding it is a directory (i.e. no module name is given), then
the module name will also default to ‘index’.
The traversal will stop and HTTP_NOT_FOUND will be returned to the client if:
• Any of the traversed object’s names begin with an underscore (‘_’). Use underscores to protect objects that
should not be accessible from the web.
• A module is encountered. Published objects cannot be modules for security reasons.
If an object in the path could not be found, HTTP_NOT_FOUND is returned to the client.
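The traversal rules above can be modelled in a few lines. This is a simplified sketch and not the Publisher source: resolve and NotFound are hypothetical names, the module is built in memory with types.ModuleType, and only the three stop conditions listed above are implemented.

```python
import os
import types


class NotFound(Exception):
    """Stands in for returning HTTP_NOT_FOUND to the client."""


def resolve(module, path_info):
    """Walk path_info left to right, mapping each element to an attribute."""
    parts = [part for part in path_info.split("/") if part] or ["index"]
    obj = module
    for name in parts:
        if name.startswith("_"):
            raise NotFound(name)   # underscore names are never published
        if not hasattr(obj, name):
            raise NotFound(name)   # nothing by that name
        obj = getattr(obj, name)
        if isinstance(obj, types.ModuleType):
            raise NotFound(name)   # modules cannot be published
    return obj


demo = types.ModuleType("index")
demo.index = lambda req: "We are in index()"
demo.hello = lambda req: "We are in hello()"
demo._private = "hidden"
demo.os = os  # an imported module visible as an attribute

print(resolve(demo, "/hello")(None))  # We are in hello()
print(resolve(demo, "")(None))        # We are in index()
```

Requests for "/_private", "/os" or "/spam" all raise NotFound in this model, matching the three cases in which the client receives a 404.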
For example, given the following configuration:
DocumentRoot /some/dir
<Directory /some/dir>
SetHandler mod_python
PythonHandler mod_python.publisher
</Directory>
and the following ‘/some/dir/index.py’ file:
def index(req):
    return "We are in index()"

def hello(req):
    return "We are in hello()"
Then:
http://www.somehost/index/index will return ‘We are in index()’
http://www.somehost/index/ will return ‘We are in index()’
http://www.somehost/index/hello will return ‘We are in hello()’
http://www.somehost/hello will return ‘We are in hello()’
http://www.somehost/spam will return ‘404 Not Found’
Once the destination object is found, if it is callable and not a class, the Publisher handler will get a list of arguments
that the object expects. This list is compared with names of fields from HTML form data submitted by the client
via POST or GET. Values of fields whose names match the names of callable object arguments will be passed as
strings. Any fields whose names do not match the names of callable argument objects will be silently dropped, unless
the destination callable object has a **kwargs style argument, in which case fields with unmatched names will be
passed in the **kwargs argument.
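The matching of form fields to function arguments can be approximated with the standard inspect module. This sketch is not how the Publisher is implemented; match_arguments is a hypothetical helper, the form data is a plain dictionary, and a field named req is ignored so that it cannot collide with the request argument.

```python
import inspect


def match_arguments(func, fields):
    """Keep only the fields the callable can accept by name."""
    params = inspect.signature(func).parameters
    has_kwargs = any(p.kind is inspect.Parameter.VAR_KEYWORD
                     for p in params.values())
    return {name: value for name, value in fields.items()
            if name != "req" and (has_kwargs or name in params)}


def hello(req, name="nobody"):
    return "Hello, %s" % name


def dump(req, **kwargs):
    return sorted(kwargs)


fields = {"name": "spam", "extra": "eggs"}
print(hello(None, **match_arguments(hello, fields)))  # Hello, spam
print(dump(None, **match_arguments(dump, fields)))    # ['extra', 'name']
```

The unmatched field extra is silently dropped for hello() but reaches dump() through its **kwargs argument.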
Authentication
The publisher handler provides simple ways to control access to modules and functions.
At every traversal step, the Publisher handler checks for presence of __auth__ and __access__ attributes (in this
order), as well as the __auth_realm__ attribute.
If __auth__ is found and it is callable, it will be called with three arguments: the Request object, a string
containing the user name and a string containing the password. If the return value of __auth__ is false, then
HTTP_UNAUTHORIZED is returned to the client (which will usually cause a password dialog box to appear).
If __auth__ is a dictionary, then the user name will be matched against the key and the password against the value
associated with this key. If the key and password do not match, HTTP_UNAUTHORIZED is returned. Note that this
requires storing passwords as clear text in source code, which is not very secure.
__auth__ can also be a constant. In this case, if it is false (i.e. None, 0, "", etc.), then HTTP_UNAUTHORIZED is
returned.
If there exists an __auth_realm__ string, it will be sent to the client as Authorization Realm (this is the text that
usually appears at the top of the password dialog box).
If __access__ is found and it is callable, it will be called with two arguments: the Request object and a string
containing the user name. If the return value of __access__ is false, then HTTP_FORBIDDEN is returned to the
client.
If __access__ is a list, then the user name will be matched against the list elements. If the user name is not in the
list, HTTP_FORBIDDEN is returned.
Similarly to __auth__, __access__ can be a constant.
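The three accepted forms of each attribute (callable, container, constant) lead to the decision logic sketched below. The functions check_auth and check_access are hypothetical and return plain booleans, where False corresponds to HTTP_UNAUTHORIZED and HTTP_FORBIDDEN respectively; this models the rules as described and is not the Publisher's own code.

```python
def check_auth(auth, req, user, password):
    """Return False where the Publisher would answer HTTP_UNAUTHORIZED."""
    if callable(auth):
        return bool(auth(req, user, password))
    if isinstance(auth, dict):
        # Key is the user name, value is the clear-text password.
        return user in auth and auth[user] == password
    return bool(auth)  # constant: any false value denies everyone


def check_access(access, req, user):
    """Return False where the Publisher would answer HTTP_FORBIDDEN."""
    if callable(access):
        return bool(access(req, user))
    if isinstance(access, (list, tuple)):
        return user in access
    return bool(access)


print(check_auth({"eggs": "spam"}, None, "eggs", "spam"))   # True
print(check_auth({"eggs": "spam"}, None, "eggs", "wrong"))  # False
print(check_access(["eggs"], None, "joe"))                  # False
```

Authentication is evaluated first, so a user who fails check_auth never reaches the access check.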
In the example below, only user ‘eggs’ with password ‘spam’ can access the hello function:
def hello(req):
    return "hello"

def hello(req):
    return "hello"
Since functions cannot be assigned attributes, to protect a function, an __auth__ or __access__ function can be
defined within the function, e.g.:
def sensitive(req):
Note that this technique will also work if __auth__ or __access__ is a constant, but will not work if they are a
dictionary or a list.
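A function protected in this way could be written as in the following sketch. The credentials and the returned text are invented for illustration; the only element taken from the description above is that __auth__ is defined inside the function it protects.

```python
def sensitive(req):
    # Nested definition: looked up by the Publisher, never called from here.
    def __auth__(req, user, password):
        # Hypothetical credentials, for illustration only.
        if user == "spam" and password == "eggs":
            return 1
        else:
            return 0

    return "sensitive information"
```

Called directly, sensitive() simply returns its string; the nested __auth__ only takes effect when the function is reached through the Publisher handler.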
The __auth__ and __access__ mechanisms exist independently of the standard PythonAuthenHandler. It is
possible to use, for example, the handler to authenticate, then the __access__ list to verify that the authenticated
user is allowed to access a particular function.
Note: In order for mod_python to access __auth__, the module containing it must first be imported. Therefore, any
module-level code will get executed during the import even if __auth__ is false. To truly protect a module from
being accessed, use other authentication mechanisms, e.g. the Apache mod_auth or a mod_python
PythonAuthenHandler handler.
In the process of matching arguments, the Publisher handler creates an instance of FieldStorage class. A reference to
this instance is stored in an attribute form of the Request object.
Since a FieldStorage can only be instantiated once per request, one must not attempt to instantiate
FieldStorage when using the Publisher handler and should use Request.form instead.
Note: Leaving debug on in a production environment will allow remote users to display source code of your PSP
pages!
SetHandler mod_python
PythonHandler mod_python.cgihandler
As of version 2.7, the cgihandler will properly reload even indirectly imported modules. This is done by saving a list of
loaded modules (sys.modules) prior to executing a CGI script, and then comparing it with a list of imported modules
after the CGI script is done. Modules (except for those whose __file__ attribute points to the standard Python library
location) will be deleted from sys.modules thereby forcing Python to load them again next time the CGI script imports
them.
If you do not want the above behavior, edit the ‘cgihandler.py’ file and comment out the code delimited by ###.
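The reload behavior described above amounts to taking a snapshot of sys.modules and discarding whatever was added. The sketch below shows that idea only; run_then_unload and fake_cgi_script are hypothetical, and unlike the cgihandler this version does not exempt modules from the standard library.

```python
import sys
import types


def run_then_unload(script):
    """Run script, then forget every module it caused to be loaded."""
    before = set(sys.modules)
    try:
        script()
    finally:
        for name in set(sys.modules) - before:
            # Removing the entry forces a fresh import next time.
            del sys.modules[name]


def fake_cgi_script():
    # Stands in for a CGI script importing one of its helper modules.
    sys.modules["_demo_helper"] = types.ModuleType("_demo_helper")


run_then_unload(fake_cgi_script)
print("_demo_helper" in sys.modules)  # False
```

Modules that were already loaded before the script ran are part of the snapshot and are therefore left alone.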
Tests show the cgihandler leaking some memory when processing a lot of file uploads. It is still not clear what causes
this. The way to work around this is to set the Apache MaxRequestsPerChild to a non-zero value.
EIGHT
Security
Considerations on using mod_python in a secure manner can be found in the mod_python wiki at CategorySecurity.
APPENDIX
New Features
• (MODPYTHON-170) Added req._request_rec, server._server_rec and conn._conn_rec
semi private members for getting access to the underlying Apache struct as a Python CObject. These can be
used in implementing SWIG bindings for lower level APIs of Apache. These members should be regarded
as experimental and there are no guarantees that they will remain present in this specific form in the
future.
• (MODPYTHON-193) Added new attribute available as req.hlist.location. For a handler executed directly
as the result of a handler directive within a Location directive, this will be set to the value of the
Location directive. If LocationMatch, or wildcards or regular expressions are used with Location, the
value will be the matched value in the URL and not the pattern.
Improvements
• (MODPYTHON-27) When using mod_python.publisher, the __auth__() and __access__() functions
and the __auth_realm__ string can now be nested within a class method as well as a normal function.
• (MODPYTHON-90) The PythonEnablePdb configuration option will now be ignored if Apache hasn’t been
started up in single process mode.
• (MODPYTHON-91) If running Apache in single process mode with PDB enabled and the "quit" command is
used to exit that debug session, an exception indicating that the PDB session has been aborted is raised rather
than None being returned with a subsequent error complaining about the handler returning an invalid value.
• (MODPYTHON-93) Improved util.FieldStorage efficiency and made the interface more dictionary like.
• (MODPYTHON-101) Force an exception when handler evaluates to something other than None but is otherwise
not callable. Previously an exception would not be generated if the handler evaluated to False.
• (MODPYTHON-107) Neither mod_python.publisher nor mod_python.psp explicitly flush output after writing
the content of the response back to the request object. By not flushing output it is now possible to use the
"CONTENT_LENGTH" output filter to add a "Content-Length" header.
• (MODPYTHON-111) Note made in session documentation that a save is required to avoid session timeouts.
• (MODPYTHON-125) The req.handler attribute is now writable. This allows a handler executing in a phase
prior to the response phase to specify which Apache module will be responsible for generating the content.
• (MODPYTHON-128) Made the req.canonical_filename attribute writable. Changed the req.finfo
attribute from being a tuple to an actual object. For backwards compatibility the attributes of the object can
still be accessed as if they were a tuple. New code however should access the attributes as member data. The
req.finfo attribute is also now writable and can be assigned to using the result of calling the new function
apache.stat(). This function is a wrapper for apr_stat().
• (MODPYTHON-129) When specifying multiple handlers for a phase, the status returned by each handler
is now treated the same as how Apache would treat the status if the handler was registered using the low
level C API. What this means is that whereas stacked handlers of any phase would in turn previously be
executed as long as they returned apache.OK, this is no longer the case and what happens is dependent
on the phase. Specifically, a handler returning apache.DECLINED no longer causes the execution
of subsequent handlers for the phase to be skipped. Instead, it will move to the next of the stacked handlers.
In the case of PythonTransHandler, PythonAuthenHandler, PythonAuthzHandler and
PythonTypeHandler, as soon as apache.OK is returned, subsequent handlers for the phase will be
skipped, as the result indicates that any processing pertinent to that phase has been completed. For other phases,
stacked handlers will continue to be executed if apache.OK is returned as well as when apache.DECLINED
is returned. This new interpretation of the status returned also applies to stacked content handlers listed against
the PythonHandler directive even though Apache notionally only ever calls at most one content handler.
Where all stacked content handlers in that phase run, the status returned from the last handler becomes the
overall status from the content phase.
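The status handling introduced by MODPYTHON-129 can be modelled roughly as follows. This is a sketch of the rules as worded above, not mod_python's dispatcher: run_phase is a hypothetical function, OK and DECLINED are local stand-ins for apache.OK and apache.DECLINED, and two points are assumptions: that any other status ends the phase immediately, and that the last status is the overall result for every phase in which all handlers run.

```python
OK, DECLINED = 0, -1  # stand-ins for apache.OK and apache.DECLINED

# Phases in which the first handler returning OK completes the phase.
STOP_ON_OK = {"PythonTransHandler", "PythonAuthenHandler",
              "PythonAuthzHandler", "PythonTypeHandler"}


def run_phase(directive, handlers, req=None):
    """Run stacked handlers for one phase and return the overall status."""
    status = DECLINED
    for handler in handlers:
        status = handler(req)
        if status == DECLINED:
            continue           # move on to the next stacked handler
        if status != OK:
            return status      # assumption: errors end the phase at once
        if directive in STOP_ON_OK:
            return OK          # phase complete, skip remaining handlers
    return status


calls = []

def declines(req):
    calls.append("declines")
    return DECLINED

def accepts(req):
    calls.append("accepts")
    return OK

def never_reached(req):
    calls.append("never_reached")
    return OK

print(run_phase("PythonAuthenHandler", [declines, accepts, never_reached]))  # 0
print(calls)  # ['declines', 'accepts']
```

With PythonFixupHandler or PythonHandler in place of PythonAuthenHandler, the same stack would run all three handlers.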
Bug Fixes
• (MODPYTHON-38) Fixed issue when using PSP pages in conjunction with publisher handler or where a PSP
error page was being triggered, that form parameters coming from content of a POST request weren’t available
or only available using a workaround. Specifically, the PSP page will now use any FieldStorage object
instance cached as req.form left there by preceding code.
• (MODPYTHON-43) Nested __auth__() functions in mod_python.publisher now execute in context of globals
from the file the function is in and not that of mod_python.publisher itself.
• (MODPYTHON-47) Fixed mod_python.publisher so it will not return a HTTP Bad Request response when
mod_auth is being used to provide Digest authentication.
• (MODPYTHON-63) When handler directives are used within Directory or DirectoryMatch directives
where wildcards or regular expressions are used, the handler directory will be set to the shortest directory
matched by the directory pattern. Handler directives can now also be used within Files and FilesMatch
directives and the handler directory will correctly resolve to the directory corresponding to the enclosing
Directory or DirectoryMatch directive, or the directory the .htaccess file is contained in.
• (MODPYTHON-76) The FilterDispatch callback should not flush the filter if it has already been closed.
• (MODPYTHON-84) The original change to fix the symlink issue for req.sendfile() was causing problems
on Win32, plus code needed to be changed to work with APR 1.2.7.
• (MODPYTHON-100) When using stacked handlers and a SERVER_RETURN exception was used to return an
OK status for that handler, any following handlers weren’t being run if appropriate for the phase.
• (MODPYTHON-109) The Py_Finalize() function was being called on child process shutdown. The call was
made from within the context of a signal handler, which is generally unsafe and would cause the
process to lock up. This function is no longer called on child process shutdown.
• (MODPYTHON-112) The req.phase attribute is no longer overwritten by an input or output filter. The
filter.is_input member should be used to determine if a filter is an input or output filter.
• (MODPYTHON-113) The PythonImport directive now uses the apache.import_module() function
to import modules to avoid reloading problems when the same module is imported from a handler.
• (MODPYTHON-114) Fixed race conditions on setting sys.path when the PythonPath directive is being
used as well as problems with infinite extension of path.
• (MODPYTHON-120) (MODPYTHON-121) Fixes to test suite so it will work on virtual hosting environments
where localhost doesn’t resolve to 127.0.0.1 but the actual IP address of the host.
• (MODPYTHON-126) When Python*Handler or Python*Filter directive is used inside of a Files
directive container, the handler/filter directory value will now correctly resolve to the directory corresponding to
any parent Directory directive or the location of the .htaccess file the Files directive is contained in.
• (MODPYTHON-133) The table object returned by req.server.get_config() was not being populated
correctly to be the state of directives set at global scope for the server.
• (MODPYTHON-134) Setting PythonDebug to Off wasn’t overriding an On setting in parent scope.
• (MODPYTHON-140) The util.redirect() function should be returning a server status of apache.DONE
and not apache.OK; otherwise it will not give the desired result if used in a non-content handler phase or where
there are stacked content handlers.
• (MODPYTHON-147) Stopped directories being added to sys.path multiple times when the PythonImport
and PythonPath directives are used.
• (MODPYTHON-148) Added missing Apache constants apache.PROXYREQ_RESPONSE and
apache.HTTP_UPGRADE_REQUIRED. Also added new constants for Apache magic mime types and
values for interpreting the req.connection.keepalive and req.read_body members.
• (MODPYTHON-150) In a multithread MPM, the apache.init() function could be called more than once
for a specific interpreter instance whereas it should only be called once.
• (MODPYTHON-151) Debug error page returned to client when an exception in a handler occurred wasn’t
escaping special HTML characters in the traceback or the details of the exception.
• (MODPYTHON-157) Wrong interpreter name used for fixup handler phase and earlier, when
PythonInterpPerDirectory was enabled and request was against a directory but client didn’t provide
the trailing slash.
• (MODPYTHON-159) Fix FieldStorage class so that it can handle multiline headers.
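The escaping fix listed under MODPYTHON-151 above can be illustrated with the modern Python standard library. This is a minimal sketch and not the code mod_python uses; render_debug_page is a hypothetical helper name.

```python
import html
import traceback


def render_debug_page(exc):
    """Return an HTML fragment with the traceback safely escaped."""
    text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    # html.escape turns <, > and & (and quotes) into entities so that
    # exception details cannot inject markup into the error page.
    return "<pre>\n%s</pre>" % html.escape(text)
```

Without the escaping step, an exception message containing markup supplied by a client would be rendered by the browser as part of the debug page.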
APPENDIX
Changes from Version 3.2.8
New Features
Improvements
• (MODPYTHON-77) Third party C modules that use the simplified API for the Global Interpreter Lock (GIL),
as described in PEP 311, can now be used. The only requirement is that such modules can only be used in the
context of the ‘main interpreter’.
• (MODPYTHON-119) DbmSession unit test no longer uses the default directory for the dbm file, so the test will
not interfere with the user’s current Apache instance.
• (MODPYTHON-158) Added additional debugging and logging output for where mod_python cannot initialise
itself properly due to Python or mod_python version mismatches or missing Python module code files.
Bug Fixes
APPENDIX
Changes from Version 3.2.7
Security Fix
• (MODPYTHON-135) Fixed possible directory traversal attack in FileSession. The session id is now checked to
ensure it only contains valid characters. This check is performed for all sessions derived from the BaseSession
class.
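A check of this kind might look like the sketch below. The exact rule mod_python applies is not stated here, so the format (32 lowercase hexadecimal characters) is an assumption, and is_valid_sid is a hypothetical name.

```python
import re

# Assumption for illustration: a session id is exactly 32 lowercase
# hexadecimal characters. The real rule may differ.
_SID_RE = re.compile(r"[0-9a-f]{32}")


def is_valid_sid(sid):
    """Return True only if the whole string matches the expected format."""
    return isinstance(sid, str) and _SID_RE.fullmatch(sid) is not None
```

Because the whole string must match, an id such as "../../etc/passwd" is rejected before it can be used to build a file name, which is the kind of traversal the fix is meant to prevent.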
APPENDIX
Changes from Version 3.1.4
New Features
Improvements
• Autoreload of a module using apache.import_module() now works if the modification time for the module
is different from the file. Previously, the module was only reloaded if the modification time of the file was
more recent. This allows for a more graceful reload if a file with an older modification time needs to be restored
from backup.
• Fixed the publisher traversal security issue.
• Object hierarchies à la CherryPy can now be published.
• mod_python.c now logs the reason for a 500 error.
• Calls to PyErr_Print in mod_python.c are now followed by fflush().
• Using an empty value with PythonOption will unset a PythonOption key.
• req.path_info is now a read/write member.
• Improvements to FieldStorage allow uploading of large files. Uploaded files are now streamed to disk, not to
memory.
• Path to flex is now discovered at configuration time or can be specified using configure
--with-flex=/path/to/flex.
• sys.argv is now initialized to ["mod_python"] so that modules like numarray and pychart can work
properly.
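The change to the autoreload rule in the first item of this list comes down to comparing modification times for inequality instead of recency. The two functions below are a hypothetical illustration of that difference, not code from mod_python.

```python
def needs_reload(recorded_mtime, file_mtime):
    """New rule: reload whenever the file's mtime differs at all."""
    return file_mtime != recorded_mtime


def needs_reload_old(recorded_mtime, file_mtime):
    """Earlier rule: reload only if the file on disk is newer."""
    return file_mtime > recorded_mtime
```

If a module was loaded with a recorded time of 200 and an older copy with time 100 is restored from backup, the new rule triggers a reload while the earlier rule does not.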
Bug Fixes
• Fixed memory leak which resulted from circular references starting from the request object.
• Fixed memory leak resulting from multiple PythonOption directives.
• Fixed Multiple/redundant interpreter creation problem.
• Cookie attributes with attribute names prefixed with $ are now ignored. See Section 4.7 for more information.
• Bug in setting up of config dir from Handler directives fixed.
• mod_python.publisher will now support modules with the same name but in different directories
• Fixed continual reloading of modules problem
• Fixed big marshalled cookies error.
• Fixed mod_python.publisher extension handling
• mod_python.publisher default index file traversal
• mod_python.publisher loading wrong module and giving no warning/error
• apply_fs_data() now works with "new style" objects
• File descriptor fd closed after ap_send_fd() in req_sendfile()
• Bug in mem cleanup in MemorySession fixed.
• Fixed bug in apache._global_lock() which could cause a segfault if the lock index parameter is greater
than the number of mutexes created at mod_python startup.
• Fixed bug where local_ip and local_host in connection object were returning remote_ip and
remote_host instead
• Fixed install_dso Makefile rule so it only installs the dso, not the python files
• Potential deadlock in psp cache handling fixed
• Fixed bug where sessions are used outside <Directory> directive.
• Fixed compile problem on IRIX. ln -s requires both TARGET and LINK_NAME on IRIX, i.e. ln -s
TARGET LINK_NAME
• Fixed ./configure problem on SuSE Linux 9.2 (x86-64). Python libraries are in lib64/ for this platform.
• Fixed req.sendfile() problem where sendfile(filename) sends the incorrect number of bytes
when filename is a symlink.
• Fixed problem where util.FieldStorage was not correctly checking the mime types of POSTed entities
• Fixed conn.local_addr and conn.remote_addr for better IPv6 support.
• Fixed psp_parser.l to properly escape backslash-n, backslash-t and backslash-r character
sequences.
• Fixed segfault bug when accessing some request object members (allowed_methods, allowed_xmethods,
content_languages) and some server object members (names, wild_names).
• Fixed request.add_handler() segfault bug when adding a handler to an empty handler list.
• Fixed PythonAutoReload directive so that AutoReload can be turned off.
• Fixed connection object read() bug on FreeBSD.
• Fixed potential buffer corruption bug in connection object read().
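The cookie item in the list above, where attribute names prefixed with $ are ignored, can be sketched as follows. This is a simplified illustration rather than the parser in mod_python's Cookie module; parse_cookie_header is a hypothetical name and the splitting rules are deliberately naive.

```python
def parse_cookie_header(header):
    """Parse a Cookie header into a dict, skipping $-prefixed attributes."""
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        name = name.strip()
        if not sep or not name:
            continue  # skip empty or malformed fragments
        if name.startswith("$"):
            continue  # e.g. $Version, $Path, $Domain sent by some clients
        cookies[name] = value.strip().strip('"')
    return cookies
```

Given the header '$Version="1"; sid=abc; $Path="/"', only the sid cookie is kept, so the $-prefixed attributes are never mistaken for cookies.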
APPENDIX
Changes from Version 2.x
• Mod_python 3.0 no longer works with Apache 1.3, only Apache 2.x is supported.
• Mod_python no longer works with Python versions less than 2.2.1
• Mod_python now supports Apache filters.
• Mod_python now supports Apache connection handlers.
• Request object supports internal_redirect().
• Connection object has read(), readline() and write().
• Server object has get_config().
• Httpdapi handler has been deprecated.
• Zpublisher handler has been deprecated.
• Username is now in req.user instead of req.connection.user