Input/Output

Process Inputs and Outputs

Process inputs and outputs are of three types:

  1. ComplexValue - Usually used for raster or vector data
  2. LiteralValue - Used for simple text strings
  3. BoundingBoxValue - Two coordinate pairs, the lower-left and upper-right corners, in a defined coordinate system.

Inputs and outputs should usually be defined in the __init__ method of the process.

ComplexValue input and Output

ComplexValue inputs and outputs are used in WPS to send larger sets of data (usually raster or vector data) into the process, or from the process back to the user. For inputs, the method pywps.Process.WPSProcess.addComplexInput() returns an instance of pywps.Process.InAndOutputs.ComplexInput. For outputs, the method pywps.Process.WPSProcess.addComplexOutput() returns an instance of pywps.Process.InAndOutputs.ComplexOutput.

The pywps.Process.InAndOutputs.ComplexInput.value and pywps.Process.InAndOutputs.ComplexOutput.value attributes contain the file name of the raster or vector file. For inputs, consider using the pywps.Process.InAndOutputs.ComplexInput.getValue() method, which can return the value either as a file object or as a file name.

For outputs, you should always use pywps.Process.InAndOutputs.ComplexOutput.setValue() to set the file name of the result. The method accepts file objects as well as file names. Sometimes users send the data as a reference to some URL (e.g. an OGC WFS or WCS service). In that case PyWPS downloads the data for you and stores it in a local file. If the client requires a reference to the output data, PyWPS will create it for you. PyWPS is also able to set up a MapServer instance for you and return OGC WFS or WCS URLs to the client. For more on this topic, see using-mapserver.

Even though you can (and should) define the supported data mimeTypes (pywps.Process.InAndOutputs.ComplexInput.formats), only the mimeType is checked. PyWPS does not validate schemas or anything else; your process should do this.

Vector data values

Vectors are usually handled as GML files. You can send any other file format as well, such as GeoJSON or KML. The only conditions are that the file is in text form (so that it fits into XML correctly) if you want to embed it in the input XML request, and that everything is stored in one file.

Vector data is the default for ComplexValue inputs and outputs: unless pywps.Process.InAndOutputs.ComplexInput.format says otherwise, text/xml (GML) is expected.

Note

Some users want to send ESRI Shapefiles. This is generally not advised: the format is binary, which is hard to use with XML, and it consists of at least three files (shp, shx and dbf). If you still want to handle shapefiles, you have to either zip everything into one file or define three separate complex inputs.
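The "zip everything into one file" option could look roughly like the sketch below. It uses only the Python standard library; the helper name zip_shapefile and its arguments are hypothetical and not part of PyWPS. Since the resulting archive is binary, it would still have to be sent by reference or Base64-encoded.

```python
import os
import zipfile

# Hypothetical helper, not part of PyWPS.
def zip_shapefile(base_path, zip_path, extensions=("shp", "shx", "dbf")):
    """Pack base_path.shp, base_path.shx and base_path.dbf into one zip file."""
    parts = ["%s.%s" % (base_path, ext) for ext in extensions]
    missing = [part for part in parts if not os.path.exists(part)]
    if missing:
        raise IOError("missing shapefile parts: " + ", ".join(missing))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in parts:
            # Store only the file name, not the full directory path
            archive.write(part, arcname=os.path.basename(part))
    return zip_path
```

A process receiving such an archive would have to unpack it itself before reading the shapefile.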

Example of simple input vector data:

self.inputVector = self.addComplexInput(identifier="in",title="Input file")

Example of more complex input vector data:

self.gmlOrSimilarIn = self.addComplexInput(identifier="input",
                        title="Input file",
                        abstract="Input vector file, usually in GML format",
                        formats = [
                                    # gml
                                    {"mimeType": 'text/xml',
                                    "encoding": 'utf-8',
                                    "schema": 'http://schemas.opengis.net/gml/3.2.1/gml.xsd'},
                                    # json
                                    {"mimeType": 'text/plain',
                                    "encoding": 'iso-8859-2',
                                    "schema": None
                                    },

                                    # kml
                                    {"mimeType": 'text/xml',
                                    "encoding": 'windows-1250',
                                    "schema": 'http://schemas.opengis.net/kml/2.2.0/ogckml22.xsd'}
                                    ],
                        # we need at least TWO input files, at most 5
                        minOccurs = 2,
                        maxOccurs = 5,
                        metadata = {'foo':'bar','spam':'eggs'}
                    )

Raster data values

Sometimes you need to work with raster data. You have to set the pywps.Process.InAndOutputs.ComplexInput.formats attribute to the supported raster file format(s). Since raster data is usually binary, you might expect that it always has to be sent as a reference. Fortunately, this is not the case: PyWPS can handle input data encoded in Base64, and whenever PyWPS needs to send raster data out as part of the Execute response XML, it is encoded with Base64 as well. Example of simple output raster data:

self.dataOut = self.addComplexOutput(identifier="raster",
                    title="Raster out",
                    formats=[{"mimeType":"image/tiff"}])
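To illustrate what Base64 encoding does to binary raster content, here is a small standalone sketch using the standard base64 module. It is not PyWPS code; the function names are made up for the example, and only the first four bytes of a little-endian TIFF file are used as sample data.

```python
import base64

def encode_for_xml(raw_bytes):
    """Return binary data as Base64 text that can be embedded in XML."""
    return base64.b64encode(raw_bytes).decode("ascii")

def decode_from_xml(text):
    """Turn Base64 text taken from XML back into the original bytes."""
    return base64.b64decode(text)

tiff_header = b"II*\x00"  # first four bytes of a little-endian TIFF file
encoded = encode_for_xml(tiff_header)
decoded = decode_from_xml(encoded)
```

The encoded text starts with "SUkq", which is why Base64-encoded TIFF payloads in Execute requests typically begin with those characters.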

MimeType

PyWPS (revision 1081) is now mimeType enabled, meaning that mimeTypes are checked and an exception report is raised on a mismatch. Working with mimeTypes can be tricky: for example, an XML input/output can be text/xml or application/xml, and it is complicated to determine automatically which one to use.

Any ComplexData input sent to the process will be checked, and its mimeType compared with the formats argument given to the addComplexInput method. An exception report will be returned if the mimeType is different from the one(s) indicated.

:
 <wps:ProcessFailed>
    <wps:ExceptionReport>
         <ows:Exception exceptionCode="InvalidParameterValue" locator="inputImage1" />
    </wps:ExceptionReport>
 </wps:ProcessFailed>
:

This exception report is very basic since it has to follow the WPS specification. PyWPS logs the mimeType error as follows:

PyWPS [2010-12-16 17:09:38,004] DEBUG: inputImage1 has mimeType application/octet-stream according to magic.
MimeType not valid according to process

The log line indicates that the input was detected as application/octet-stream and that this mimeType wasn't listed in the process.

So what is this "magic"? PyWPS uses libmagic to check the mimeType of complex input and output content. This library is also used by the "file" command to determine the mimeType, so in case of doubt the user can run this command to check the mimeType of a file.

# file --mime-type <file_name>

Note: Other format parameters, such as encoding or schema, are not checked.
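The comparison described above can be pictured with the following sketch. It is not the actual PyWPS implementation; the function name is hypothetical, and the case-insensitive comparison is an assumption. Like PyWPS, it looks only at the mimeType key and ignores encoding and schema.

```python
# Hypothetical illustration, not PyWPS internals.
def mime_type_allowed(detected, formats):
    """Return True if the detected mimeType is listed in the formats list."""
    allowed = [fmt.get("mimeType") for fmt in formats]
    # Assumption: compare case-insensitively, skip formats without a mimeType
    allowed = [mime.lower() for mime in allowed if mime]
    return detected.lower() in allowed

formats = [{"mimeType": "image/tiff"},
           {"mimeType": "text/xml", "encoding": "utf-8"}]
tiff_ok = mime_type_allowed("image/tiff", formats)
binary_ok = mime_type_allowed("application/octet-stream", formats)
```

In this example, an input detected as application/octet-stream would be rejected, which corresponds to the log message shown earlier.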

Outputs are also checked for mimeType, but no exception is raised; in this case the error is only logged:

PyWPS [2010-12-10 08:57:27,577] DEBUG: Incorrect mimetype in outputImage1

MimeType checking can be problematic, since PyWPS has to determine what is Base64 (which looks like a string and must then be decoded) and what is plain text / XML.

It's recommended to define a corresponding mimeType for each output.


LiteralValue input and Output

With literal inputs and outputs, you can obtain or send any type of character string. You will obtain an instance of the pywps.Process.InAndOutputs.LiteralInput or pywps.Process.InAndOutputs.LiteralOutput class.

Literal value inputs can be more complex. You can define a list of allowed values, the type of the literal input, spacing and so on.

Note

Spacing is not supported, so you currently cannot define the step within a range of allowed values.

Type

For type settings, you can use either the types module or the type() function of Python. The default type is type(0), i.e. integer. PyWPS will check whether the input value type matches the allowed type.

Note

If you need the string type of literal input, PyWPS will always remove everything after “#”, “;”, “!”, “&” and similar characters. Even so, avoid passing a LiteralValue input directly to e.g. an SQL database or a command-line program. You could cause a serious system compromise.
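One common way to keep a literal value away from the SQL text itself is a parameterized query. The sketch below uses the standard sqlite3 module with an in-memory database; the table, column and function names are invented for the example and have nothing to do with PyWPS.

```python
import sqlite3

def find_layer_ids(conn, layer_name):
    """Look up layers by name, passing the value as a bound parameter."""
    # The "?" placeholder keeps layer_name out of the SQL statement text
    cursor = conn.execute("SELECT id FROM layers WHERE name = ?", (layer_name,))
    return [row[0] for row in cursor.fetchall()]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO layers (name) VALUES (?)", ("roads",))

good = find_layer_ids(conn, "roads")
# A hostile value is treated as a plain string and matches nothing
hostile = find_layer_ids(conn, "roads' OR '1'='1")
```

In a process, layer_name would be the result of getValue() on a literal input. For command-line programs, the analogous precaution is to pass arguments as a list rather than building a shell string.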

Allowed values

PyWPS lets you define a list of allowed input values. These can be of string, integer or float type. Discrete values are given directly in the list. Ranges are defined as two-item lists of the form [minimum, maximum]. For example, to allow the values 1, 2, 3, the range 5 to 7, and 'spam', pywps.Process.InAndOutputs.LiteralInput.values would look like:

[1,2,3,[5,7],'spam']

Default is “*”, which means all values. Simple example of LiteralValue output:

self.widthOut = self.addLiteralOutput(identifier = "width",
                     title = "Width")

Complex example of LiteralValue input:

self.litIn = self.addLiteralInput(identifier = "eggs",
                title = "Eggs",
                abstract = "Eggs with spam and sausages",
                minOccurs = 0,
                maxOccurs = 1,
                uoms = "m",
                type=type(0.0),
                default=1.1,
                allowedValues=[[0.0,10.1]])
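To make the allowed-values format more concrete, the following standalone sketch checks a value against such a list. It is only an illustration of the format, not the check PyWPS performs internally; the function name is hypothetical, and treating range bounds as inclusive is an assumption.

```python
# Hypothetical illustration of the allowed-values format.
def is_allowed(value, allowed_values="*"):
    """Return True if value matches a single value or falls inside a range."""
    if allowed_values == "*":
        return True  # "*" means all values are allowed
    for item in allowed_values:
        if isinstance(item, (list, tuple)):
            low, high = item
            # Assumption: both range bounds are inclusive
            if isinstance(value, (int, float)) and low <= value <= high:
                return True
        elif item == value:
            return True
    return False

rules = [1, 2, 3, [5, 7], 'spam']
results = [is_allowed(v, rules) for v in (2, 6, 4, 'spam', 'eggs')]
```

With the list from the example above, 2, 6 and 'spam' are accepted, while 4 and 'eggs' are not.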

BoundingBoxValue input and Output

A BoundingBox is two pairs of coordinates, defined in some coordinate system, in two or three dimensions. In PyWPS, bounding boxes are represented by pywps.Process.InAndOutputs.BoundingBoxInput and pywps.Process.InAndOutputs.BoundingBoxOutput. To create them, use pywps.Process.WPSProcess.addBBoxInput() and pywps.Process.WPSProcess.addBBoxOutput() respectively. The value is a list of four coordinates in (minx, miny, maxx, maxy) format. Example of BoundingBoxValue input:

self.bbox = self.addBBoxInput(identifier = "bbox",
                          title = "BBox")

Gets and Sets

Inputs and outputs are accessed using the getValue() and setValue() methods of LiteralValue/ComplexData/BoundingBox objects. The returned object can be a string, a file object, or a list containing several objects (see: Multiple inputs).

LiteralInput

To read a submitted literal value or ComplexData, it is enough to call getValue():

inputValue = self.litIn.getValue()

The inputValue will contain the LiteralValue, either a string or a number.

To set a value, it is sufficient to pass any string or number to the setValue() method:

self.litOut.setValue("42")

ComplexData

ComplexData follows the same philosophy, *but* getValue() returns the name of the file where the content is located (this allows for easier integration with GRASS). To access the "raw" data, the user has to open the file and read the content:

GMLFileLocation=self.gmlOrSimilarIn.getValue() #GMLFileLocation is something like: ./pywpsInputVcUjAu (if input has no minOccurs)
GMLDataIO=open(GMLFileLocation,'r')

Then the IO object can be read for content (using read()) or immediately passed to some parser:

from lxml import etree
GMLTree=etree.parse(GMLDataIO)
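If lxml is not installed, the same kind of file can also be parsed with the standard library. The sketch below is a standalone illustration: the temporary file only stands in for the file PyWPS would have written, and read_complex_input is a hypothetical helper name.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def read_complex_input(file_name):
    """Parse the XML stored in file_name and return its root element."""
    return ET.parse(file_name).getroot()

# Stand-in for the file name returned by getValue()
fd, path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "w") as handle:
    handle.write("<foo><bacon></bacon><eggs></eggs></foo>")

root = read_complex_input(path)
child_tags = [child.tag for child in root]
os.remove(path)
```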

The setValue() method of ComplexData accepts a file name string or a file object (it is not possible to pass a StringIO.StringIO object), which will be used to fill the ComplexData output in the WPS response. If a user has an XML string, it has to be saved in a file, and the file object or name passed to the method:

import tempfile

XMLStr="<foo><bacon></bacon><eggs></eggs></foo>"
# delete=False keeps the file on disk after it is closed
tmpFile=tempfile.NamedTemporaryFile(suffix='.xml',prefix='tmp',delete=False)
tmpFile.write(XMLStr)
tmpFile.flush() # make sure the content is on disk before it is read
self.GMLDataOutput.setValue(tmpFile)
#self.GMLDataOutput.setValue(tmpFile.name) is also possible

In the case above we create a temporary file that is not deleted when it is closed (delete=False). It contains the XML string and is passed to a ComplexData output called GMLDataOutput.

From SVN revision 1020, it's possible to use StringIO and cStringIO objects inside the setValue method, so the example above can now be something like:

import StringIO

XMLStr="<foo><bacon></bacon><eggs></eggs></foo>"
tmpIO=StringIO.StringIO() # note the parentheses: create an instance
tmpIO.write(XMLStr)
self.GMLDataOutput.setValue(tmpIO)

or in a compressed way:

self.GMLDataOutput.setValue(StringIO.StringIO("<foo><bacon></bacon><eggs></eggs></foo>"))

BBOX

Sets and gets of BBOX work as class instances. For example a get request:

BBOXObject=self.BBOXInput.getValue()

Will return a BBOX class instance with the following properties:

* coords
* crs
* dimensions

To access the coordinates:

CoordTuple=BBOXObject.coords

The returned tuple will have the following structure: ([minx,miny],[maxx, maxy]), for example: ([-11.0, -12.0], [13.0, 14.0]). Therefore:

BBOXObject.coords[0][0] #minx=-11
BBOXObject.coords[0][1] #miny=-12

BBOXObject.coords[1][0] #maxx=13
BBOXObject.coords[1][1] #maxy=14

The default value of dimensions is 2, while the default value of crs is None.

The setValue() method only accepts a tuple structured as follows:

self.bboxout.setValue(([-11,-12],[13,14]))
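Working with the coords structure can be sketched as follows. The example uses a plain tuple in the ([minx,miny],[maxx,maxy]) form described above instead of a real BBOX instance, assumes a two-dimensional box, and bbox_extent is a hypothetical helper name.

```python
# Hypothetical helper operating on the coords structure of a 2D bounding box.
def bbox_extent(coords):
    """Return (width, height) for coords given as ([minx, miny], [maxx, maxy])."""
    (minx, miny), (maxx, maxy) = coords
    return (maxx - minx, maxy - miny)

coords = ([-11.0, -12.0], [13.0, 14.0])  # same values as in the example above
width, height = bbox_extent(coords)
```

In a process, coords would come from BBOXObject.coords.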

Multiple inputs

WPS allows for multiple inputs sharing the same identifier, for example

<wps:DataInputs>
   <wps:Input>
      <ows:Identifier>indata</ows:Identifier>
      <wps:Data>
          <wps:ComplexData mimeType="application/xml"><a><b/></a></wps:ComplexData>
      </wps:Data>
   </wps:Input>
   <wps:Input>
      <ows:Identifier>indata</ows:Identifier>
      <wps:Data>
          <wps:ComplexData mimeType="application/xml"><c><d/></c></wps:ComplexData>
      </wps:Data>
   </wps:Input> 
</wps:DataInputs>

In this case the input indata will contain 2 data objects (one "<a><b/></a>" and another object "<c><d/></c>").

Multiple inputs are aggregated in a Python list that is returned when the user calls getValue(). A list is only returned when the ComplexInput definition contains minOccurs and/or maxOccurs, for example:

self.indata = self.addComplexInput(identifier="indata",title="Complex in",formats=[{"mimeType":"application/xml"}],minOccurs=0,maxOccurs=1024)
:
:
self.indata.getValue() # returns a list like this: ['./pywpsInputzEb_7j', './pywpsInput0qke4W']
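Because getValue() may return either a single file name or a list, process code that should work in both cases can normalize the result first. The helper below is a hypothetical sketch, not part of PyWPS; treating None as "no input" is an assumption.

```python
# Hypothetical helper, not part of PyWPS.
def as_list(value):
    """Wrap a single value in a list; return lists and tuples as lists."""
    if value is None:
        return []  # assumption: None means no input was supplied
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]

single = as_list('./pywpsInputVcUjAu')
several = as_list(['./pywpsInputzEb_7j', './pywpsInput0qke4W'])
```

The process can then simply loop over as_list(self.indata.getValue()) regardless of how many inputs were sent.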

MimeType Problem w/ Multiple Inputs

It's common for image processing functions to work with different image types. In WPS an image type is identified by its mimeType, and it is not a problem for PyWPS to accept multiple images at once (this is common in the WPS-GRASS-Bridge processes):

<wps:DataInputs>
  <wps:Input>
     <ows:Identifier>inraster</ows:Identifier>
     <wps:Data>
          <wps:ComplexData mimeType="image/tiff">SUkqAAgAAAARAAABAwABAAAA....=</wps:ComplexData>
     </wps:Data>
  </wps:Input>
  <wps:Input>
     <ows:Identifier>inraster</ows:Identifier>
     <wps:Data>
          <wps:ComplexData mimeType="image/png">xgTGBMYEyATGBMYEyQ....=</wps:ComplexData>
      </wps:Data>
   </wps:Input> 
</wps:DataInputs>

However, PyWPS does not keep track of the individual mimeTypes: all the raster images are reported with a single mimeType, that of the last image in the input list:

self.inputs["inraster"].format --> [{'mimetype':'image/png'}]
self.inputs["inraster"].value -->  ['./pywpsInputzEb_7j', './pywpsInput0qke4W']
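One possible workaround, outside of PyWPS, is to inspect each input file in the process itself. The sketch below recognizes only PNG and TIFF by their leading signature bytes and is not a replacement for libmagic; the function name is hypothetical.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")  # little-endian and big-endian TIFF

# Hypothetical helper, not part of PyWPS.
def sniff_image_mime_type(file_name):
    """Guess image/png or image/tiff from the first bytes; None if unknown."""
    with open(file_name, "rb") as handle:
        head = handle.read(8)
    if head.startswith(PNG_SIGNATURE):
        return "image/png"
    if head.startswith(TIFF_SIGNATURES):
        return "image/tiff"
    return None
```

Applied to every file name in the value list, this would give one mimeType per input instead of the single one stored in format.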

Hopefully in PyWPS 4.0 things will be --Jmdj 15:24, 6 July 2011 (BST)