PyWPS 4.0 Ideas

Be Python 3 ready - use Python 2.7

In theory, if all the code is Python 2.7 compliant, it shouldn't be a problem to use the 2to3 tool to port it to Python 3 when the time comes and necessity demands it.
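As a rough illustration of what "2.7 compliant" could mean in practice, here is a minimal sketch that assumes the code base adopts `__future__` imports. The function `describe_ratio` is hypothetical and is not part of PyWPS; the point is only that the same file runs unchanged on Python 2.7 and Python 3.

```python
# Sketch only: names and values are illustrative, not PyWPS code.
from __future__ import division, print_function, unicode_literals


def describe_ratio(done, total):
    """Return a progress message; behaves the same on Python 2.7 and 3."""
    # With `division` imported, / is true division on both interpreters.
    percent = done / total * 100
    return "{0:.1f}% done".format(percent)


print(describe_ratio(1, 4))
```

Code written this way leaves less work for 2to3 later on, since print, division and string literals already follow Python 3 semantics.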

Speed up - be PyPy-compliant

- PyPy uses Restricted Python, which limits the set of instructions that can be used; this may cause problems in some sections of the code
- Use PyLint to audit the source and find out which code needs to be changed
- If too much code needs to be changed, then it's better to drop this option


Interesting proof of concept, Python running faster than C [1]

PyPy

PyPy 1.6 (C compiled and non compiled) has been tested, and PyWPS 3.2 works fine with it. The only requirement is that the magic module be added to the PyPy package folder; since magic is a ctypes module wrapping libmagic, it works without problems.

PyWPS's code is too linear (meaning no big loops) for the JIT to be effective. Therefore PyPy runs 5x slower than CPython. The benchmark was done with the benchmark.py script located in the test folder.

PyPy should be able to accelerate WPS processes that have big loops, but so far a process using GRASS hasn't been tested. Also, the GDAL wrappers don't use ctypes, whereas the Django GDAL modules do use ctypes to access libgdal.

PyPy's C translation hasn't been tested. Since PyPy runs PyWPS, it shouldn't be a problem to translate the Python code into C and compile it.

--Jmdj 09:26, 16 March 2011 (UTC)

Use lxml for XML parsing AND writing

- Use of XSLT to output WPS content? (Too complicated; maybe a template engine is better.)
- Massive XML parsing acceleration
- Low memory footprint
- Maybe XML input/output can be parsed into WPS objects using schema referencing?

Bring asynchronous calls to Windows

- Python's multiprocessing module; it may be a problem with PyPy?!

In the debug mode, validate input XMLs against schemas.

OK, but WPS complex I/O contains schemas, and parsers prefer all schemas to be located in the root document.

Prepare for WPS 2.0

When is a draft document going to be produced ?! Bastian said, it should be during the winter. Nobody knows. --Jachym 17:25, 22 February 2011 (UTC)

Closer integration with GRASS GIS

Both the 6.x and 7.x branches.
Better support for multiple inputs with one identifier, for example mimeType for each input and not for each input identifier

XML output generation

- Some XSLT hacking? extensions, classes
- Simpler solution: ComplexDataClass/LiteralDataClass/BBoxClass each have an __xml__ method that generates the specific XML for that class. Putting together the XML of all the class instances gives the WPS output document
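The `__xml__` idea above could look roughly like the following sketch. Everything here is an assumption made for illustration: the classes `LiteralData` and `BBoxData`, the helper `build_response`, and the unqualified element names are hypothetical and do not follow the WPS schema; the standard library ElementTree is used only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET


class LiteralData(object):
    """Hypothetical stand-in for a literal output, not the PyWPS class."""

    def __init__(self, identifier, value):
        self.identifier = identifier
        self.value = value

    def __xml__(self):
        # Each class knows how to serialise itself to one element.
        output = ET.Element("Output")
        ET.SubElement(output, "Identifier").text = self.identifier
        ET.SubElement(output, "LiteralData").text = str(self.value)
        return output


class BBoxData(object):
    """Hypothetical stand-in for a bounding-box output."""

    def __init__(self, identifier, lower, upper):
        self.identifier = identifier
        self.lower = lower
        self.upper = upper

    def __xml__(self):
        output = ET.Element("Output")
        ET.SubElement(output, "Identifier").text = self.identifier
        bbox = ET.SubElement(output, "BoundingBox")
        ET.SubElement(bbox, "LowerCorner").text = " ".join(map(str, self.lower))
        ET.SubElement(bbox, "UpperCorner").text = " ".join(map(str, self.upper))
        return output


def build_response(outputs):
    """Append the element produced by every output to one document root."""
    root = ET.Element("ProcessOutputs")
    for item in outputs:
        root.append(item.__xml__())
    return ET.tostring(root, encoding="unicode")


document = build_response([
    LiteralData("result", 42),
    BBoxData("extent", (0, 0), (10, 20)),
])
```

The response builder never needs to know which kind of output it is handling, which is the main attraction compared with XSLT or one large template.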

License change

Raising this question once again: is GPL really the best license for PyWPS? Isn't it too restrictive? It is difficult to bind some 3rd-party software as a PyWPS process, because then it would have to be released under the GNU GPL as well.

Some decision helper: http://www.zdnet.com/blog/burnette/howto-pick-an-open-source-license-part-1/130

I think releasing PyWPS under a BSD-type license is the best choice. --Jachym 11:42, 23 February 2011 (UTC)

Process concept

Is the concept of the process within PyWPS the right one? Currently, a process is either

  • Instance of pywps.Process.WPSProcess
  • Class derived from pywps.Process.WPSProcess

A processes package (directory) has to be set up. "Registering" a process means editing the __init__.py file within the processes directory.

Should we do it in a different way? Using a configuration file?

deegree3 and ZOO WPS use XML/YAML files to describe a process and then use that description to create the process. Another problem is that module loading (each process is a module) is extremely time consuming; for example, loading 100 processes takes so much time that sometimes I get time-outs. WPS-GRASS-Bridge means hundreds of small processes being loaded.

Using an XML file description would allow for the development of a PyWPS admin platform like the one in 52 North --Jmdj 09:23, 25 February 2011 (UTC)
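To make the configuration-file idea more concrete, here is a minimal sketch of a registry that describes processes from a description file and imports a process module only when it is executed. All names (`PROCESS_CONFIG`, `LazyRegistry`) are hypothetical, JSON is used instead of XML/YAML only to stay within the standard library, and the stdlib modules `math` and `json` stand in for real process modules.

```python
import importlib
import json

# Hypothetical description file content; stdlib modules stand in for
# real process modules so that the sketch runs anywhere.
PROCESS_CONFIG = json.loads("""
{
  "sqrt": {"module": "math", "callable": "sqrt", "title": "Square root"},
  "dumps": {"module": "json", "callable": "dumps", "title": "Serialise"}
}
""")


class LazyRegistry(object):
    """Describe every process from config; import a module only on execute."""

    def __init__(self, config):
        self._config = config
        self._loaded = {}

    def describe(self):
        # No imports happen here, so listing many processes stays cheap.
        return sorted((name, entry["title"])
                      for name, entry in self._config.items())

    def execute(self, name, *args):
        if name not in self._loaded:
            entry = self._config[name]
            module = importlib.import_module(entry["module"])
            self._loaded[name] = getattr(module, entry["callable"])
        return self._loaded[name](*args)


registry = LazyRegistry(PROCESS_CONFIG)
```

With this layout, a GetCapabilities-style listing only reads the descriptions, and the cost of importing a module is paid once, by the first request that executes that process.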

Use subprocess for async calls

subprocess.Popen maybe? Anyway: bring async to Windows and Java!

Or maybe use a thread structure that runs the process, even for sync processes. The process would be launched as a thread, while the rest of the code either checks on the thread constantly (async call) or only collects the result when the thread has finished (sync call). Python has a "fake" thread system (GIL threads) that runs everything on the same CPU core, with threads just rotating inside the Python interpreter. The alternative is the new multiprocessing package (Python 2.6), which supports "real" parallelism on different cores by using separate processes.

Using GIL threads shouldn't be much of a performance issue, since 99% of the CPU time will be spent running the process thread. Also, the thread module runs on win32/Linux/Solaris and is supported in Jython [2]

--Jmdj 09:02, 16 March 2011 (UTC)
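A minimal sketch of the thread idea, using only the standard threading module. `ProcessRunner` and `slow_add` are hypothetical names made up for this example and say nothing about how trunk actually implements it; the sketch only shows that one wrapper can serve both the sync case (wait for the result) and the async case (poll the status).

```python
import threading
import time


class ProcessRunner(object):
    """Hypothetical wrapper that runs one process callable in a thread."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self.result = None
        self.error = None
        self._thread = threading.Thread(target=self._run)

    def _run(self):
        try:
            self.result = self._func(*self._args)
        except Exception as exc:  # keep the failure for the caller
            self.error = exc

    def start(self):
        self._thread.start()

    def is_running(self):
        return self._thread.is_alive()

    def wait(self):
        # Sync call: block until the process thread has finished.
        self._thread.join()
        return self.result


def slow_add(a, b):
    time.sleep(0.1)  # stands in for a long-running process
    return a + b


# Sync style: start, then wait for the result.
sync_runner = ProcessRunner(slow_add, 1, 2)
sync_runner.start()
sync_result = sync_runner.wait()

# Async style: start, then poll the status until the thread ends.
async_runner = ProcessRunner(slow_add, 3, 4)
async_runner.start()
while async_runner.is_running():
    time.sleep(0.01)
async_result = async_runner.result
```

In a real async Execute request the polling would happen across separate HTTP requests, so the status would have to be stored somewhere outside the thread object; that part is not covered here.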

Well, threads should be an option too. Subprocess is not really the best-performing solution.

Anyhow, I've already implemented this in trunk (did not test on Windows yet).

--Jachym 00:30, 28 September 2011 (BST)

Web Server

Maybe using a dedicated web development framework like web2py ?! This would probably allow for REST services

--Jmdj 09:02, 16 March 2011 (UTC)

I like Django

--Jachym 00:30, 28 September 2011 (BST)

Decorators

Maybe processes could be served not as a WPSProcess class but with a simpler decorator structure around a function ?! Something more or less like this

    wps = PyWPS()
    @wps.process(returns={'result': int}, args={'a': int, 'b': int})
    def execute(a, b):
        return a + b
    

--Jmdj 09:09, 16 March 2011 (UTC)
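For the decorator proposal to work, something has to stand behind `wps.process`. The following is one possible sketch; the `PyWPS` class shown here is hypothetical (no such class exists in PyWPS 3.x), and it only handles a single named output and simple type conversion.

```python
class PyWPS(object):
    """Hypothetical registry; not an existing PyWPS class."""

    def __init__(self):
        self.processes = {}

    def process(self, returns, args):
        def decorator(func):
            def wrapper(**inputs):
                # Convert every declared input with its declared type.
                converted = dict((name, kind(inputs[name]))
                                 for name, kind in args.items())
                value = func(**converted)
                # Only a single named output is handled in this sketch.
                out_name, out_kind = list(returns.items())[0]
                return {out_name: out_kind(value)}
            # Register under the function name; return func untouched so
            # it can still be called as plain Python.
            self.processes[func.__name__] = wrapper
            return func
        return decorator


wps = PyWPS()


@wps.process(returns={'result': int}, args={'a': int, 'b': int})
def execute(a, b):
    return a + b


# Inputs arrive as strings, as they would from a KVP request.
response = wps.processes['execute'](a='2', b='3')
```

The declared types double as the process description, so DescribeProcess output could in principle be generated from the same `args` and `returns` dictionaries.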

Template engine

Changing to a better supported/documented template engine, especially one that would accept Python objects.

  • Mako (fast)
  • Cheetah (fast)
  • QPy (a lot faster than Mako?)
  • Pyratemp (small)

--Jmdj 08:54, 21 March 2011 (UTC)

Jython

Jython doesn't fully support lxml !!!!

Therefore, XPath and XML handling in Jython have to be provided by other libraries.

The PyWPS 4.0 main structure should be XML-parser agnostic, meaning it should have its own XML parsing layer, something like:

    import pywps.XML.parser

    xmlDoc = pywps.XML.parser.parseString("<a></a>")

parseString would call a parsing class/method in lxml when running on CPython, or in DOM4J when running on Jython.

Suggestions ?!
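One way such a parser-agnostic layer could be sketched: pick the backend at import time and expose a single `parseString` function. This is an assumption-heavy illustration, not a design decision; in particular, the standard library ElementTree is used as the fallback only so the sketch runs without extra packages, whereas a Jython build would plug a Java library in at that point.

```python
try:
    # Preferred backend on CPython, if it is installed.
    from lxml import etree as _backend
    BACKEND_NAME = "lxml"
except ImportError:
    # Stand-in fallback for this sketch; a Jython build would plug in
    # a Java library here instead.
    import xml.etree.ElementTree as _backend
    BACKEND_NAME = "elementtree"


def parseString(text):
    """Parse an XML string with whichever backend was importable."""
    return _backend.fromstring(text)


xmlDoc = parseString("<a><b>1</b></a>")
```

The catch is that the rest of the code may only use the API subset that every backend shares (tag, text, find and similar), otherwise the abstraction leaks.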

WSGI

Use of WSGI [5] [6] [7]. This would allow PyWPS to work like a "Java servlet" that can be ported to several web application frameworks, such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web, aside from being able to run inside mod_wsgi in Apache.

--Jmdj 10:28, 25 March 2011 (UTC)

Implemented in trunk, did not *really* test yet.

--Jachym 00:31, 28 September 2011 (BST)
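For readers unfamiliar with WSGI, this is roughly what the entry point looks like. The sketch is not the code in trunk: the `application` body and its placeholder responses are made up for illustration, and `call` is a hypothetical helper that invokes the app the way a WSGI server would, using the standard wsgiref utilities.

```python
from wsgiref.util import setup_testing_defaults
from xml.sax.saxutils import quoteattr

try:
    from urllib.parse import parse_qs  # Python 3
except ImportError:
    from urlparse import parse_qs  # Python 2


def application(environ, start_response):
    """Minimal WSGI callable; the responses are placeholders."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # KVP parameter names are matched case-insensitively in this sketch.
    params = dict((key.lower(), values[0]) for key, values in query.items())
    if params.get("service", "").lower() != "wps":
        status = "400 Bad Request"
        body = b"<error>service must be WPS</error>"
    else:
        status = "200 OK"
        request = quoteattr(params.get("request", ""))
        body = ("<ok request=%s/>" % request).encode("utf-8")
    start_response(status, [("Content-Type", "text/xml"),
                            ("Content-Length", str(len(body)))])
    return [body]


def call(query_string):
    """Invoke the app directly, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

Because the whole interface is the single `application(environ, start_response)` callable, the same object can be handed to mod_wsgi, to wsgiref's simple server, or to any of the frameworks listed above.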

generateDS

  • generateDS expects qualified elements: <xsd:foo> or the default <xs:foo>; otherwise it gives a NoneType error
  • include is not working properly in generateDS; it is better to put everything in a single file
  • WPS wpsDescribeProcess_request.xsd for generateDS: [8]
  • Running python /usr/bin/generateDS.py -a "xsd:" -o wpsDescribeProcess.py -s wpsDescribeProcessSubs.py test.xsd produces 6000 lines of code!