Mod python

Mod_python

"Mod_python is an Apache module that embeds the Python interpreter within the server. With mod_python you can write web-based applications in Python that will run many times faster than traditional CGI and will have access to advanced features such as ability to retain database connections and other data between hits and access to Apache internals. " (in http://www.modpython.org/)

PyWPS was originally designed to run only as a CGI script, which is why it can also be run from a bash shell. As of August 2010, the SVN version (version 3.2) provides a wps.py script designed to be integrated into mod_python. The script is in the webservices/mod_python folder of the SVN tree:

https://svn.wald.intevation.org/svn/pywps/trunk/webservices/mod_python/wps.py

So what are the reasons to use it?

  • Python embedded into Apache
  • Ability to handle request phases, filters and connections
  • Interface with Apache API

Embedding Python into Apache is the basic idea of mod_python: the Python interpreter is loaded into Apache as a module. This allows Python code to be run internally by Apache rather than spawned as a new process that writes its output to the client/browser.

So what is the advantage? Speed! According to the mod_python documentation, a simple comparison of CGI versus mod_python shows that mod_python can handle 50x more requests than identical code run as CGI.

Before a client request actually reaches the stage where the Python code is run (either as CGI or under mod_python), it has to pass through several filters, metadata processing and logging. Code running as CGI cannot easily interact with Apache's request process. Code running under mod_python has easy access to the request process, so it is relatively easy to retrieve data and manipulate the request in the most convenient way. Basically, mod_python provides friendly access to Apache's API and functionality.

Install

The mod_python documentation has detailed information on how to install it. The suggested install is from source code, but there is no problem with using a package install.

Assuming that mod_python is properly installed, PyWPS will be installed as a handler, meaning wps.py will be the default script to be run. The configuration should be inside the <Directory> section of httpd.conf or in the .htaccess file (please consult Apache's documentation for more information). In the following example we have a DocumentRoot of "/var/www/html" (defined in another section of the httpd.conf file), and the folder /wps will contain the wps.py script:

<Directory "/var/www/html/wps">
 AddHandler mod_python .py
 Options Indexes FollowSymLinks
 AllowOverride None
 Order allow,deny
 Allow from all
 PythonHandler wps
 PythonDebug On
 PythonAutoReload On
 PythonOption PYWPS_PROCESSES /usr/lib/python2.6/site-packages/pywps/processes
 PythonOption PYWPS_CFG /etc/pywps.cfg
</Directory>

The mod_python configuration works more or less like the wrapper script used in the CGI procedure. The PyWPS process location and configuration file are still passed as parameters, but in this case as mod_python PythonOption directives.

One important note when working with mod_python: setting PYWPS_PROCESSES and PYWPS_CFG as environment variables inside Apache will not work:

SetEnv PYWPS_PROCESSES /usr/lib/python2.6/site-packages/pywps/processes
SetEnv PYWPS_CFG /etc/pywps.cfg

Mod_python will not "see" these environment variables. Only values passed using PythonOption, and then fetched from the request object, will be used by the script.
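As a rough sketch of how a handler can fetch PythonOption values from the request object: mod_python exposes them through req.get_options(), which returns a dict-like table. The FakeRequest class and the read_pywps_settings() helper below are hypothetical stand-ins so that the example runs without Apache; they are not part of mod_python or PyWPS, and the real wps.py may read its options differently.

```python
# Sketch only: FakeRequest imitates the one method of mod_python's request
# object that matters here, so the example runs outside Apache.

class FakeRequest(object):
    """Hypothetical stand-in exposing get_options() like a mod_python request."""

    def __init__(self, options):
        self._options = dict(options)

    def get_options(self):
        # mod_python returns a dict-like table of PythonOption key/value pairs
        return self._options


def read_pywps_settings(req):
    """Return (processes_dir, cfg_file); a value is None when its option is unset."""
    options = req.get_options()
    return options.get("PYWPS_PROCESSES"), options.get("PYWPS_CFG")


# Values mirror the PythonOption lines of the <Directory> example
req = FakeRequest({
    "PYWPS_PROCESSES": "/usr/lib/python2.6/site-packages/pywps/processes",
    "PYWPS_CFG": "/etc/pywps.cfg",
})
processes_dir, cfg_file = read_pywps_settings(req)
```

Inside a real handler the req argument is supplied by mod_python, so only the body of read_pywps_settings() would be needed.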

The AddHandler directive tells Apache that any file with the .py extension should be processed by mod_python. PythonHandler points to the module that will accept the HTTP request. When using PythonHandler it is only possible to have one handler per URL: even if the URL does not name the script, mod_python will assume that wps.py processes the request.

    • PythonDebug and PythonAutoReload are meant for development environments. PythonDebug allows debug messages to be returned to the client, and PythonAutoReload checks for changes in the Python code and reloads any new code; basically it tries to avoid caching of compiled code. In production environments these options are not advisable; it is better to use:
PythonDebug Off
PythonOptimize On

Another advantage of using mod_python is that PYTHONPATH is very simple to extend:

PythonPath "sys.path+['/users/jesus/workspace/pywps-3.2-soap/pywps']"

In this case a pywps package located in some other directory has been appended to the PYTHONPATH.
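The PythonPath value is a Python expression that produces a list. A minimal sketch of what the expression above evaluates to, run here in a plain interpreter rather than inside Apache (the variable names are for illustration only):

```python
import sys

extra_dir = "/users/jesus/workspace/pywps-3.2-soap/pywps"
original_length = len(sys.path)

# Same expression as the PythonPath value: list concatenation builds a new
# list with the extra directory at the end and leaves sys.path untouched.
new_path = sys.path + [extra_dir]
```

Because the directory is added at the end, modules already found earlier in the path still take precedence over the ones in the appended directory.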

Example of basic authentication

Following the simple authentication example shown in mod_python documentation, a WPS could be protected with the following httpd.conf:

<Directory "/var/www/html/mod_python">
 AddHandler mod_python .py
 Options Indexes FollowSymLinks
 AllowOverride None
 Order allow,deny
 Allow from all
 PythonHandler wps
 PythonAuthenHandler wps
 PythonDebug On
 PythonAutoReload On
 AuthType Basic
 AuthName "Restricted Area"
 require valid-user
</Directory>

The PythonAuthenHandler directive indicates that wps.py contains a function authenhandler() that performs the authentication. If authentication succeeds, the handler() function will deal with the request. It should be enough to add the following code to wps.py:

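from mod_python import apache

def authenhandler(req):
    # get_basic_auth_pw() must be called before req.user is read
    pw = req.get_basic_auth_pw()
    user = req.user

    # Hard-coded credentials, for demonstration only
    if user == "bacon" and pw == "eggs":
        return apache.OK
    else:
        return apache.HTTP_UNAUTHORIZED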

authenhandler() gets the password and user information from the request object and checks whether the user is "bacon" and the password is "eggs". If so, Apache continues processing the request (apache.OK); otherwise Apache is instructed to return an HTTP 401 Unauthorized response.
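For background, HTTP Basic authentication sends the credentials in an Authorization header as the base64 encoding of "user:password"; mod_python decodes this before the handler sees it. The following standalone sketch shows that decoding step. parse_basic_auth() is a hypothetical helper written for illustration; it is not part of mod_python or PyWPS and is not needed in wps.py.

```python
import base64


def parse_basic_auth(header_value):
    """Split an 'Authorization: Basic ...' header value into (user, password).

    Returns None when the header is not a well-formed Basic credential.
    """
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip()).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    # Only the first colon separates user and password
    user, sep, password = decoded.partition(":")
    if not sep:
        return None
    return user, password


# Build the header a client would send for user "bacon", password "eggs"
header = "Basic " + base64.b64encode(b"bacon:eggs").decode("ascii")
credentials = parse_basic_auth(header)
```

Since base64 is an encoding and not encryption, Basic authentication is normally combined with HTTPS.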

Filter In/Out

Another advantage of mod_python is the use of filters on the body of an HTTP request or response. Filters are simple to set up, but they can be a bit "tricky" since they do not know when they are called or whether they are called in the main request or some sub-request. Normally it is necessary to make them context-sensitive.

Probably the simplest filter for WPS is an output filter that encrypts the XML content. To start, it is necessary to register the filter with mod_python and Apache, inside the directory section that contains the mod_python definition:

PythonOutputFilter wps ENCRYPT
AddOutputFilter ENCRYPT .py

This says: we have a filter called ENCRYPT in the handler wps (meaning in the file wps.py), and because it is registered with PythonOutputFilter, the function called in the handler is outputfilter(). Then AddOutputFilter applies the ENCRYPT filter to all files with the .py extension.

AddOutputFilter is defined in mod_mime, which maps filters to file extensions. Filters can also be applied by MIME type; for example, we could have used the following directive, which would apply the filter to any XML content:

AddOutputFilterByType ENCRYPT text/xml

Now that we have the filter set we need to include the outputfilter function in the wps.py file:

from itertools import izip, cycle
from mod_python import apache

def xor_crypt_string(data, key):
    # XOR every character of data with the repeating key
    return ''.join(chr(ord(x) ^ ord(y)) for (x, y) in izip(data, cycle(key)))

def outputfilter(filter):
    req = filter.req  # the request object this filter belongs to
    s = filter.read()  # always read and then write the filter (no matter what)
    if req.status == apache.HTTP_OK:
        s_crypt = xor_crypt_string(s, "FOSS4G")
        filter.write(s_crypt)
    else:
        filter.write(s)
    filter.close()  # always close the stream, otherwise it will write twice

Here the function xor_crypt_string() performs a simple XOR encryption. The filter only encrypts the content if the HTTP status code is 200; this way any error message will not be encrypted and will be output as plain text.

In the code above, the output filter receives the filter object, which contains the HTTP request as one of its members, and then reads the filter. If the HTTP status is 200, it encrypts the HTTP body content and writes the result to the filter. If the status is not 200, it just writes the content back unchanged. Finally it closes the filter.
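A client receiving the filtered response has to reverse the XOR step with the same key. As a minimal sketch of that round trip, here is a bytes-based variant of the same technique written for Python 3 so that it runs standalone; xor_bytes() and the sample payload are illustrative names and data, not part of wps.py.

```python
from itertools import cycle


def xor_bytes(data, key):
    """XOR each byte of data with the repeating key.

    XOR is its own inverse, so the same call both encrypts and decrypts.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"FOSS4G"                       # same key as in the output filter
plain = b"<wps:Capabilities/>"        # sample payload, for illustration only
encrypted = xor_bytes(plain, key)     # what the filter would send
decrypted = xor_bytes(encrypted, key) # what the client recovers
```

The encrypted body has the same length as the original, so a Content-Length header computed before filtering remains valid.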

Remarks

In the end, mod_python allows considerable manipulation of the whole environment outside the wps.py script. The advantage is that the same script needs no major changes to implement "exotic features" like request checks, filters and encryption.

--Wikiadmin 16:11, 10 January 2011 (UTC)