"Mod_python is an Apache module that embeds the Python interpreter within the server. With mod_python you can write web-based applications in Python that will run many times faster than traditional CGI and will have access to advanced features such as ability to retain database connections and other data between hits and access to Apache internals. " (in http://www.modpython.org/)
Originally PyWPS was prepared to be run only as a CGI script, which is why it can also be run from bash. Currently (August 2010) the SVN version (version 3.2) provides a wps.py script designed to be integrated into mod_python. This script is in the webservice folder of the SVN tree.
So what are the reasons to use it?
- Python embedded into Apache
- Ability to handle request phases, filters and connections
- Interface with Apache API
Python embedded into Apache is the basic definition of mod_python: the Python interpreter is loaded into Apache as a module. This allows Python code to be run internally by Apache instead of being spawned as a new process that writes its output to the client/browser.
So what is the advantage? Speed! As indicated in the mod_python documentation, a simple comparison of CGI versus mod_python shows that mod_python can handle 50x more requests than identical code run as CGI.
Before a client request reaches the point where the Python code is run (either as CGI or under mod_python), it has to pass through several filters, metadata processing and logging. If the code runs as CGI, it is complicated to interact with Apache's request processing. If the code runs under mod_python, the request process is easy to access, and it is relatively easy to retrieve data and manipulate the request as needed. Basically, mod_python provides friendly access to Apache's API and functionality.
The mod_python documentation has detailed information on how to install it. The suggested install is from source code, but there are no problems with using a package install.
Assuming that mod_python is properly installed, PyWPS will be installed as a handler, meaning wps.py will be the default script to be run. The default configuration should be inside the <Directory> section of httpd.conf or in the .htaccess file (please consult Apache's documentation for more information). In the following example we have a DocumentRoot of "/var/www/html" (defined in another section of the httpd.conf file), and the folder /wps will contain the wps.py script:
    <Directory "/var/www/html/wps">
        AddHandler mod_python .py
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
        PythonHandler wps
        PythonDebug On
        PythonAutoReload On
        PythonOption PYWPS_PROCESSES /usr/lib/python2.6/site-packages/pywps/processes
        PythonOption PYWPS_CFG /etc/pywps.cfg
    </Directory>
The mod_python configuration works more or less like the wrapper script used in the CGI procedure. The PyWPS process location and configuration file are still passed as parameters, but in this case as mod_python PythonOption directives.
One important note when working with mod_python: setting PYWPS_PROCESSES and PYWPS_CFG as environment variables inside Apache will not work:
    SetEnv PYWPS_PROCESSES /usr/lib/python2.6/site-packages/pywps/processes
    SetEnv PYWPS_CFG /etc/pywps.cfg
Mod_python will not "see" these environment variables. Only variables passed using PythonOption, and then fetched from the request object, will be used by the script.
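As a minimal sketch of how PythonOption values can be fetched from the request object: mod_python exposes them through req.get_options(), which returns a table that behaves much like a dictionary. The helper name pywps_settings and the FakeRequest class are assumptions made for illustration and are not part of PyWPS or mod_python; FakeRequest only exists so the helper can be tried outside Apache.

```python
def pywps_settings(req):
    """Collect the PyWPS settings passed with PythonOption directives."""
    options = req.get_options()  # mod_python: table of PythonOption values
    settings = {}
    for name in ("PYWPS_PROCESSES", "PYWPS_CFG"):
        value = options.get(name)
        if value is not None:
            settings[name] = value
    return settings


class FakeRequest(object):
    """Hypothetical stand-in for mod_python's request object (testing only)."""

    def __init__(self, options):
        self._options = options

    def get_options(self):
        return self._options


if __name__ == "__main__":
    req = FakeRequest({"PYWPS_CFG": "/etc/pywps.cfg"})
    print(pywps_settings(req))
```

Inside a real handler the same call would be made on the req argument that mod_python passes to handler().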
The AddHandler option informs Apache that any file with the .py extension should be processed by mod_python. PythonHandler points to the file that will accept the HTTP request. When using PythonHandler it is only possible to have one handler per URL, so even if the URL doesn't indicate the script, mod_python will assume that wps.py will process the request.
PythonDebug and PythonAutoReload are used in development environments. PythonDebug allows debug messages to be returned to the client, and PythonAutoReload checks for changes in the Python code and reloads any new code; basically, it tries to avoid caching of compiled code. These options are not advisable in production environments, where it is better to use:
    PythonDebug Off
    PythonOptimize On
Another advantage of using mod_python is that PYTHONPATH is very simple to change, for example to append a pywps package located in some other directory.
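The configuration example that belonged here is not preserved. As a sketch of what it might look like, mod_python's PythonPath directive takes a Python expression that evaluates to the new module search path. The directory /opt/pywps-dev below is a hypothetical placeholder, not the path used originally.

```apache
# Assumption: /opt/pywps-dev is a placeholder for the directory holding the pywps package
PythonPath "sys.path + ['/opt/pywps-dev']"
```

The directive goes in the same <Directory> section as the PythonHandler line.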
Example of basic authentication
Following the simple authentication example shown in mod_python documentation, a WPS could be protected with the following httpd.conf:
    <Directory "/var/www/html/mod_python">
        AddHandler mod_python .py
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
        PythonHandler wps
        PythonAuthenHandler wps
        PythonDebug On
        PythonAutoReload On
        AuthType Basic
        AuthName "Restricted Area"
        require valid-user
    </Directory>
PythonAuthenHandler indicates that wps.py contains a function authenhandler() that will do some sort of authentication. If things go OK, the handler() function will then deal with the request. It should be enough to add the following code to wps.py:
    from mod_python import apache

    def authenhandler(req):
        # get_basic_auth_pw() has to be called before req.user is read
        pw = req.get_basic_auth_pw()
        user = req.user
        if user == "bacon" and pw == "eggs":
            return apache.OK
        else:
            return apache.HTTP_UNAUTHORIZED
The authenhandler() function gets the password and user information from the request object and checks whether the user is "bacon" and the password is "eggs". If so, it returns apache.OK and Apache continues processing the request; otherwise it instructs Apache to return an HTTP 401 Unauthorized response.
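The credential check can be kept in a small function of its own, so it can be tried without Apache. The following is only a sketch under assumptions: the USERS table and the name check_credentials are hypothetical, and the comparison uses hmac.compare_digest from the standard library instead of the plain == shown in the example.

```python
import hmac

# Hypothetical user table for illustration; a real deployment would not
# keep plain-text passwords in the source file.
USERS = {"bacon": "eggs"}


def check_credentials(user, password, users=USERS):
    """Return True if the user exists and the password matches."""
    if user is None or password is None:
        return False
    expected = users.get(user)
    if expected is None:
        return False
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(password.encode("utf-8"),
                               expected.encode("utf-8"))


if __name__ == "__main__":
    print(check_credentials("bacon", "eggs"))
    print(check_credentials("bacon", "spam"))
```

An authenhandler() could then return apache.OK when check_credentials(req.user, pw) is true and apache.HTTP_UNAUTHORIZED otherwise.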
Another advantage of mod_python is the use of filters on the body of an HTTP request or response. Filters are simple to set up, but they can be a bit "tricky", since they have no knowledge of when they are called or whether they are called in the main request or in some sub-request. Normally it is necessary to make them context-sensitive.
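One way to make a filter context-sensitive, sketched here under assumptions, is to look at the request object before touching the body. In mod_python the request's main attribute points to the main request when the current one is a sub-request, and is None otherwise. The helper name should_filter and the FakeRequest class are hypothetical and only serve to run the check outside Apache.

```python
HTTP_OK = 200  # same value as apache.HTTP_OK


def should_filter(req):
    """Return True only for a main request that finished with status 200."""
    is_subrequest = getattr(req, "main", None) is not None
    return (not is_subrequest) and req.status == HTTP_OK


class FakeRequest(object):
    """Hypothetical stand-in for mod_python's request object (testing only)."""

    def __init__(self, status, main=None):
        self.status = status
        self.main = main


if __name__ == "__main__":
    main_req = FakeRequest(200)
    sub_req = FakeRequest(200, main=main_req)
    print(should_filter(main_req), should_filter(sub_req))
```

A filter function could call such a helper first and simply pass the content through unchanged when it returns False.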
Probably the simplest filter for WPS is an output filter that encrypts the XML content. To start, it is necessary to register the filter with mod_python and Apache, inside the directory section that contains the mod_python definition:
    PythonOutputFilter wps ENCRYPT
    AddOutputFilter ENCRYPT .py
This says: we have a filter called ENCRYPT inside the handler wps (meaning inside the file wps.py), and the function in the handler is called outputfilter, the name PythonOutputFilter looks for by default. Then we add the filter ENCRYPT (AddOutputFilter), which should be applied to all files with the extension .py.
The AddOutputFilter directive is defined in mod_mime, meaning that filters are applied by file extension. Filters can also be applied by MIME type; for example, we could have used the following directive, which would apply the filter to any XML content:
AddOutputFilterByType ENCRYPT text/xml
Now that we have the filter set we need to include the outputfilter function in the wps.py file:
    from itertools import izip, cycle
    from mod_python import apache

    def xor_crypt_string(data, key):
        # XOR each character of data with the key, repeating the key as needed
        return ''.join(chr(ord(x) ^ ord(y)) for (x, y) in izip(data, cycle(key)))

    def outputfilter(filter):
        req = filter.req  # the request object this filter belongs to
        s = filter.read()  # always read and then write the filter, no matter what
        if req.status == apache.HTTP_OK:
            s_crypt = xor_crypt_string(s, "FOSS4G")
            filter.write(s_crypt)
        else:
            filter.write(s)
        filter.close()  # always close the stream, otherwise it will write twice
Here the function xor_crypt_string() performs a simple XOR encryption. The filter only encrypts the content if the HTTP status code is 200; this way any error message or problem will not be encrypted and will be output as plain text.
In the code above, the output filter receives the filter object, which contains the HTTP request as one of its members, and then reads the filter. If the HTTP status is 200, it encrypts the HTTP body content and writes the result to the filter. If the status is not 200, it just writes the content back unchanged. Finally it closes the filter.
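Because XOR is symmetric, a client can recover the XML by applying the same key to the response a second time. The following is a small sketch of that round trip, written for Python 3 and operating on bytes. The name xor_crypt_bytes is an assumption made for illustration; it is a variant of xor_crypt_string() and not part of PyWPS.

```python
from itertools import cycle


def xor_crypt_bytes(data, key):
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


if __name__ == "__main__":
    key = b"FOSS4G"
    body = b"<wps:Capabilities/>"
    encrypted = xor_crypt_bytes(body, key)
    # Applying the same function with the same key restores the original
    print(xor_crypt_bytes(encrypted, key) == body)
```

The same function therefore serves for both directions, as long as the server-side filter and the client share the key.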
In the end, mod_python allows considerable manipulation of the environment outside the wps.py script. The advantage is that the same script can be kept, without major changes, while implementing "exotic features" like request checks, filters and encryption.
--Wikiadmin 16:11, 10 January 2011 (UTC)