Introduction

What is WPS

WPS stands for Web Processing Service and it is a standard OGC (Open Geospatial Consortium) protocol to make GIS calculations/models available over the Internet. In theory WPS should be able to describe any process, run it using pre-defined inputs/outputs and report errors and status.

As defined by OGC:

... provides rules for standardizing how inputs and outputs (requests and responses) for geospatial processing services, such as polygon overlay. The standard also defines how a client can request the execution of a process, and how the output from the process is handled. It defines an interface that facilitates the publishing of geospatial processes and clients’ discovery of and binding to those processes. The data required by the WPS can be delivered across a network or they can be available at the server.

WPS supports simultaneous exposure of processes via HTTP GET, HTTP POST, and SOAP, thus allowing the client to choose the most appropriate interface mechanism. Normally a process's metadata is gathered using HTTP GET, while its execution is done using HTTP POST.

WPS defines three operations:

  1. GetCapabilities returns service-level metadata
  2. DescribeProcess returns a description of a process including its inputs and outputs
  3. Execute returns the output(s) of a process

Although the specification states that requests should be case insensitive, it is recommended to use upper camel case (for example GetCapabilities) in all WPS operation requests.

GetCapabilities

A GetCapabilities request returns basic service metadata, namely:

  1. Service Identification (keywords and abstract describing the service)
  2. Service Provider (who provides the service and how to contact the person in charge [name, telephone etc])
  3. Operation Metadata (HTTP GET and POST description and links to operations)
  4. Processes offered (List of processes, with abstract, identifier, metadata etc)
  5. Languages supported (Languages supported by the service; normally the default value is English)
  6. WSDL file location (URI to the Web Service Description Language file, allowing the service to be used in other SOA structures that support WSDL)

A GetCapabilities HTTP GET request looks like this:

http://apps.esdi-humboldt.cz/pywps/?service=WPS&request=GetCapabilities

The key-value pairs (KVP) REQUEST and SERVICE are mandatory, and the optional parameter Version can also be used. The part of the URL before the question mark (http://apps.esdi-humboldt.cz/pywps/ in the example above) is the service URI. In general, parameters are formatted as: ...&key1=value1&key2=value2&...
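As a rough sketch of the key-value-pair format described above, the following Python snippet assembles the GetCapabilities URL using only the standard library. The helper name build_kvp_url is made up for this example and is not part of PyWPS.

```python
from urllib.parse import urlencode


def build_kvp_url(service_uri, **params):
    """Append key-value pairs to a service URI as a query string.

    Hypothetical helper for illustration only; not a PyWPS function.
    """
    return service_uri + "?" + urlencode(params)


url = build_kvp_url(
    "http://apps.esdi-humboldt.cz/pywps/",
    service="WPS",
    request="GetCapabilities",
)
print(url)
```

The same helper could be reused for DescribeProcess by passing version and identifier as additional keyword arguments.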

The KVP request can be translated into an XML request submitted via HTTP POST to the WPS:

<?xml version="1.0" encoding="UTF-8"?>
<ows:GetCapabilities xmlns:ows="http://www.opengis.net/ows/1.1"
xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.opengis.net/ows/1.1 ..\wpsGetCapabilities_request.xsd"
language="en" service="WPS">
 <ows:AcceptVersions>
 <ows:Version>1.0.0</ows:Version>
 </ows:AcceptVersions>
</ows:GetCapabilities>
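A minimal sketch of how such an XML document might be submitted via HTTP POST from Python is shown below. It only prepares the request object and does not send anything; the function name make_post_request, the shortened XML body and the text/xml content type are assumptions made for this example.

```python
import urllib.request


def make_post_request(service_uri, xml_body):
    """Prepare (but do not send) an HTTP POST carrying an XML request.

    Hypothetical helper; the Content-Type value is an assumption.
    """
    return urllib.request.Request(
        service_uri,
        data=xml_body.encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "text/xml"},
    )


# Shortened request body, used here only as a placeholder.
body = (
    '<ows:GetCapabilities xmlns:ows="http://www.opengis.net/ows/1.1" '
    'service="WPS" language="en"/>'
)
request = make_post_request("http://apps.esdi-humboldt.cz/pywps/", body)
print(request.get_method(), request.full_url)
```

Passing the prepared object to urllib.request.urlopen would perform the actual call against a live server.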

DescribeProcess

This request provides a means for a client to determine the mandatory, optional, and default parameters for a particular process, as well as the format of the data inputs and outputs.

A DescribeProcess HTTP GET request looks like this:

http://apps.esdi-humboldt.cz/pywps/?service=WPS&version=1.0.0&request=DescribeProcess&identifier=all


The identifier and version key-value pairs are mandatory in this request. The response will return the following information:


  1. Service identification (Title and abstract)
  2. Identifier (Service's unique identifier)
  3. Data Inputs
  4. Data Outputs

identifier may be a process identifier as stated in the Capabilities response, or keyword ‘all’ for all process descriptions. The above HTTP GET URL is equivalent to the following XML request submitted using HTTP POST:

<DescribeProcess xmlns="http://www.opengis.net/wps/1.0.0"
 xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 ../wpsDescribeProcess_request.xsd"
 service="WPS" version="1.0.0" language="en-CA">
 <ows:Identifier>all</ows:Identifier>
 </DescribeProcess>

Description of Data Inputs and Outputs

Three types of inputs and outputs are defined in the OGC standard: LiteralData, ComplexData and BoundingBox data.

LiteralData Description

LiteralData can be any character string, float, date, etc., normally described as a primitive datatype in the W3C XML Schema standard (http://www.w3.org/TR/xmlschema-2/#built-in-primitive-datatypes). The WPS standard also allows the use of UOM (units of measure), default values and AllowedValues. This means the service can be "conscious" of unit types, can use a default value when an input is missing, and can require an input to fall within a specific range.

For example, DescribeProcess for a given process may contain the following input description:

<Input minOccurs="0" maxOccurs="1">
 <ows:Identifier>method</ows:Identifier>
 <ows:Title>Interpolation method</ows:Title>
 <ows:Abstract>Interpolation method to be used in a dataset of points</ows:Abstract>
 <LiteralData>
 <ows:DataType ows:reference="xs:string">string</ows:DataType>
 <ows:AllowedValues>
 <ows:Value>idw</ows:Value>
 <ows:Value>kriging</ows:Value>
 <ows:Value>thysson</ows:Value>
 </ows:AllowedValues>
 <DefaultValue>kriging</DefaultValue>
 </LiteralData>
 </Input>

The XML above says: an optional input named "method", which may occur at most once, is of type string and can be equal to "idw", "kriging" or "thysson"; if no value is specified, "kriging" is used.
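To illustrate how a client might use such a description, the sketch below reads the allowed values and the default from a trimmed copy of the input description and resolves the value to send. The function resolve_literal is hypothetical, and the xmlns:ows declaration was added to the fragment so that it parses on its own.

```python
import xml.etree.ElementTree as ET

OWS_NS = {"ows": "http://www.opengis.net/ows/1.1"}

# Trimmed copy of the description, with the ows namespace declared.
DESCRIPTION = """
<Input xmlns:ows="http://www.opengis.net/ows/1.1" minOccurs="0" maxOccurs="1">
  <ows:Identifier>method</ows:Identifier>
  <LiteralData>
    <ows:DataType ows:reference="xs:string">string</ows:DataType>
    <ows:AllowedValues>
      <ows:Value>idw</ows:Value>
      <ows:Value>kriging</ows:Value>
      <ows:Value>thysson</ows:Value>
    </ows:AllowedValues>
    <DefaultValue>kriging</DefaultValue>
  </LiteralData>
</Input>
"""


def resolve_literal(description_xml, value=None):
    """Return the value to use, falling back to DefaultValue when missing."""
    root = ET.fromstring(description_xml.strip())
    allowed = [v.text for v in root.findall(".//ows:AllowedValues/ows:Value", OWS_NS)]
    default = root.findtext("LiteralData/DefaultValue")
    if value is None:
        return default
    if allowed and value not in allowed:
        raise ValueError(f"{value!r} is not one of {allowed}")
    return value


print(resolve_literal(DESCRIPTION))         # no value given, default is used
print(resolve_literal(DESCRIPTION, "idw"))  # an allowed value is kept as is
```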

<Input minOccurs="0" maxOccurs="1">
 <ows:Identifier>BufferDistance</ows:Identifier>
 <ows:Title>Buffer Distance</ows:Title>
 <ows:Abstract>Distance to be used to calculate buffer.</ows:Abstract>
 <LiteralData>
 <ows:DataType ows:reference="http://www.w3.org/TR/xmlschema-2/#float">float</ows:DataType>
 <UOMs>
 <Default>
 <ows:UOM>meters</ows:UOM>
 </Default>
 <Supported>
 <ows:UOM>meters</ows:UOM>
 <ows:UOM>feet</ows:UOM>
 </Supported>
 </UOMs>
 <ows:AnyValue/>
 <DefaultValue>100</DefaultValue>
 </LiteralData>
 </Input>

In this case we have an input that is a float number and whose units can be in meters or feet; if the input is not specified, then 100.0 meters will be assumed.
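The behaviour described above (default value, default unit, supported units) could be mirrored in a process roughly as follows. This is only a sketch: the function name and the idea of normalising everything to meters are assumptions, not something the WPS standard or PyWPS prescribes.

```python
FEET_TO_METERS = 0.3048  # exact definition of the international foot


def buffer_distance_in_meters(value=None, uom=None):
    """Apply the defaults from the description and normalise to meters.

    Hypothetical helper for illustration only.
    """
    if value is None:
        value = 100.0       # DefaultValue in the description
    if uom is None:
        uom = "meters"      # default UOM in the description
    if uom == "meters":
        return float(value)
    if uom == "feet":
        return float(value) * FEET_TO_METERS
    raise ValueError(f"unsupported unit of measure: {uom!r}")


print(buffer_distance_in_meters())            # defaults apply
print(buffer_distance_in_meters(10, "feet"))  # converted to meters
```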

Normally UOM, DefaultValue and AllowedValues aren't used, but the WPS can report errors if these inputs are incorrect or badly crafted.

Complex Data description

The ComplexData type is used for passing complex data (vector, raster or other) to the server, or for obtaining it as the result of a process. There are two ways this complex data can be handled:

  • Either you send the data directly as part of the request to the server, or you obtain it as part of the XML response from the server. This is mostly done with vector data, using GML or another text-based format. Raster data can be encoded using base64 encoding.
  • Or you send or obtain just a reference to the data: a URL to the file or service from which the data can be downloaded.
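For the first option, binary raster data has to be turned into text before it can be embedded in XML. A minimal sketch using Python's standard base64 module is shown below; the helper names are made up, and the four sample bytes are simply the start of a little-endian TIFF file.

```python
import base64


def encode_inline(raw_bytes):
    """Encode binary data as base64 text suitable for embedding in XML."""
    return base64.b64encode(raw_bytes).decode("ascii")


def decode_inline(text):
    """Reverse of encode_inline: recover the original bytes."""
    return base64.b64decode(text)


tiff_magic = b"II*\x00"  # first four bytes of a little-endian TIFF
encoded = encode_inline(tiff_magic)
print(encoded)
```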

Normally a ComplexData type follows the logic Format/Encoding/Schema. A maximum file size for the input can also be defined, and the server will raise an error if it is exceeded. For example, a complex data input may be described as follows:

<Input minOccurs="1" maxOccurs="1">
<ows:Identifier>InputPolygon</ows:Identifier>
<ows:Title>Polygon to be buffered</ows:Title>
<ows:Abstract>URI to a set of GML that describes the polygon.</ows:Abstract>
<ComplexData maximumMegabytes="5">
 <Default>
 <Format>
 <MimeType>text/xml</MimeType>
 <Encoding>base64</Encoding>
 <Schema>http://foo.bar/gml/3.1.0/polygon.xsd</Schema>
 </Format>
 </Default>
 <Supported>
 <Format>
 <MimeType>text/xml</MimeType>
 <Encoding>UTF-8</Encoding>
 <Schema>http://foo.bar/gml/3.1.0/polygon.xsd</Schema>
 </Format>
 </Supported>
</ComplexData>
</Input>

In this case we have a complex input that is mandatory (minOccurs="1"), identified as InputPolygon, whose size can't exceed 5 megabytes. By default it should be XML with base64 encoding, and if it needs to be validated, a schema can be found at http://foo.bar/gml/3.1.0/polygon.xsd. An XML input using UTF-8 is also supported.

Bounding Box Data description

Bounding Box, or just BBOX, is the third data type and it is used to describe a bounding box area. The input description must state the default coordinate reference system (CRS), normally a URI to the EPSG code system, and which other CRSs are supported. Note: The supported CRSs shall contain the default CRS.

<Input>
 <ows:Identifier>bboxInput</ows:Identifier>
 <ows:Title>bounding box of dummy polygon</ows:Title>
 <ows:Abstract>Bounding box of dummy polygon to be used for
fast polygon interception calculation</ows:Abstract>
 <BoundingBoxData>
 <Default>
 <CRS>urn:ogc:def:crs:EPSG:6.6:4326</CRS>
 </Default>
 <Supported>
 <CRSsType>
 <CRS>urn:ogc:def:crs:EPSG:6.6:4326</CRS>
 <CRS>urn:ogc:def:crs:EPSG:6.6:4979</CRS>
 </CRSsType>
 </Supported>
 </BoundingBoxData>
 </Input>

In the example above, a bbox input is expected in EPSG:4326 (EPSG database version 6.6), meaning Lat/Long WGS84. The service also supports input in EPSG:4979, which is WGS84 Lat/Long with an additional ellipsoidal height. A KVP request would look like:

....&bboxInput=-71.63,41.75,-70.78,42.90,urn:ogc:def:crs:EPSG:6.6:4326

where bbox=LowerCorner longitude,LowerCorner latitude,UpperCorner longitude,UpperCorner latitude,crs URI
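The layout just described (four corner values followed by a CRS URI) can be unpacked with a few lines of Python. This is a sketch under the assumption that the value has already been URL-decoded; parse_bbox is a made-up name, and an optional trailing number of dimensions is tolerated but not required.

```python
def parse_bbox(value):
    """Split a KVP bounding box value into corners, CRS and dimensions.

    Hypothetical helper; expects 'minx,miny,maxx,maxy[,crs[,dimensions]]'.
    """
    parts = value.split(",")
    if len(parts) < 4:
        raise ValueError("a bounding box needs at least four coordinates")
    lower = (float(parts[0]), float(parts[1]))
    upper = (float(parts[2]), float(parts[3]))
    crs = parts[4] if len(parts) > 4 else None
    dimensions = int(parts[5]) if len(parts) > 5 else None
    return lower, upper, crs, dimensions


lower, upper, crs, dimensions = parse_bbox(
    "46,102,47,103,urn:ogc:def:crs:EPSG:6.6:4326"
)
print(lower, upper, crs, dimensions)
```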

Execute

The Execute request is the most important request, since it launches the specified process implemented by the service. In an Execute request a client (human or not) must specify:

  1. Process identifier
  2. Input values as defined in the DescribeProcess response
  3. Version and language
  4. Type of Output either:
    1. Stored in the server
    2. Contained inside the XML response
    3. Raw response of single output, dump the result to the client (for example an image)
  5. Whether the server shall return a status document (synchronous or asynchronous call)
  6. Whether the input data should be returned in the response document (lineage)


KVP Execute request

Execute requests are better done using XML, since some of them can be complex. Nevertheless, KVP (key-value pair) and HTTP GET can be used to run an Execute request following the protocol's guidelines and encoding system.

As an example, here is a KVP Execute request for a process that takes LiteralData inputs of three types (int, float and string):

http://apps.esdi-humboldt.cz/pywps/?
service=WPS&version=1.0.0&
request=Execute&
identifier=literalprocess&
datainputs=int=1;float=3.2;zeroset=0;string=spam&
storeExecuteResponse=false&
lineage=true&
status=false

The service and version parameters are mandatory, as in DescribeProcess. datainputs contains the inputs, which in this case are identified as int, float, zeroset and string, with assigned values 1, 3.2, 0 and spam respectively. storeExecuteResponse=false requests that the server does not store the response document. lineage=true requests that the returned document include the inputs used, and status=false indicates that the reply should be synchronous, meaning no status document but a direct response with the results.

Parameters like storeExecuteResponse or status depend on the process itself, meaning the author of the process defines whether it will support storing the response and status updates.

The KVP inputs should be encoded using the following rules:

  1. A semicolon (;) shall be used to separate one input from the next
  2. An equal sign (=) shall be used to separate an input name from its value and attributes, and an attribute name from its value
  3. An at symbol (@) shall be used to separate an input value from its attributes and one attribute from another.
  4. All field values and attribute values shall be encoded using the standard Internet practice for encoding URLs
  5. PyWPS also supports the use of [ ] to group the datainputs as follows: datainputs=[int=1;float=3.2]
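The encoding rules above can be sketched in a few lines of Python. The function encode_datainputs is hypothetical, and it percent-encodes every reserved character in values (including ":" and "/"), which is stricter than some of the hand-written examples on this page.

```python
from urllib.parse import quote


def encode_datainputs(inputs):
    """Encode (name, value, attributes) tuples as a datainputs string.

    Hypothetical helper: ';' separates inputs, '=' separates names from
    values, '@' separates attributes, and values are URL-encoded.
    """
    encoded = []
    for name, value, attributes in inputs:
        item = f"{name}={quote(str(value), safe='')}"
        for attr_name, attr_value in attributes.items():
            item += f"@{attr_name}={quote(str(attr_value), safe='')}"
        encoded.append(item)
    return ";".join(encoded)


literals = encode_datainputs([
    ("int", 1, {}),
    ("float", 3.2, {}),
    ("width", 35, {"uom": "meter"}),
])
print(literals)
```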

A literalData value could be coded as follows:

....width=35@datatype=xs:integer@uom=meter

An XML input as ComplexValue:

....complexFieldName=http%3A%2F%2Ffoo%2Ebar%2Fshapefile@Format=text/xml@Encoding=utf-8@Schema=gml.xsd

In this case the XML input is located at http://foo.bar/shapefile and it is an XML file with schema gml.xsd.

Bounding box example: bboxInput=46,102,47,103,urn:ogc:def:crs:EPSG:6.6:4326,2
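Going the other way, a server or a debugging script might split one encoded input back into its name, value and attributes. The sketch below does this for the complexFieldName example; parse_input is a made-up name and the approach assumes that any "@" inside a value has been percent-encoded.

```python
from urllib.parse import unquote


def parse_input(item):
    """Split 'name=value@attr=val@...' into (name, value, attributes)."""
    head, *attribute_parts = item.split("@")
    name, _, value = head.partition("=")
    attributes = {}
    for part in attribute_parts:
        key, _, raw = part.partition("=")
        attributes[key] = unquote(raw)
    return name, unquote(value), attributes


name, value, attributes = parse_input(
    "complexFieldName=http%3A%2F%2Ffoo%2Ebar%2Fshapefile"
    "@Format=text/xml@Encoding=utf-8@Schema=gml.xsd"
)
print(name, value, attributes)
```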

The literalprocess Execute request shown earlier will generate the following response document:

<?xml version="1.0" encoding="utf-8"?>
<wps:ExecuteResponse xmlns:wps="http://www.opengis.net/wps/1.0.0"
xmlns:ows="http://www.opengis.net/ows/1.1"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.opengis.net/wps/1.0.0
http://schemas.opengis.net/wps/1.0.0/wpsGetCapabilities_response.xsd"
service="WPS" version="1.0.0" xml:lang="eng"
serviceInstance="http://appd.esdi-humboldt.cz/pywps/?service=WPS&amp;request=GetCapabilities&amp;version=1.0.0"
statusLocation="http://apps.esdi-humboldt.cz/wps/wpsoutputs/pywps-128222636556.xml">
 <wps:Process wps:processVersion="None">
 <ows:Identifier>literalprocess</ows:Identifier>
 <ows:Title>Literal process</ows:Title>
 </wps:Process>
 <wps:Status creationTime="Thu Aug 19 15:59:25 2010">
 <wps:ProcessSucceeded>PyWPS Process literalprocess successfully calculated</wps:ProcessSucceeded>

 </wps:Status>
 <wps:ProcessOutputs>
 <wps:Output>
 <ows:Identifier>int</ows:Identifier>
 <ows:Title>Integer data out</ows:Title>
 <wps:Data>
 <wps:LiteralData dataType="integer">1</wps:LiteralData>

 </wps:Data>
 </wps:Output>
 <wps:Output>
 <ows:Identifier>float</ows:Identifier>
 <ows:Title>Float data out</ows:Title>
 <wps:Data>
 <wps:LiteralData dataType="float">3.2</wps:LiteralData>

 </wps:Data>
 </wps:Output>
 <wps:Output>
 <ows:Identifier>string</ows:Identifier>
 <ows:Title>String data out</ows:Title>
 <wps:Data>
 <wps:LiteralData dataType="string">spam</wps:LiteralData>

 </wps:Data>
 </wps:Output>
 </wps:ProcessOutputs>
</wps:ExecuteResponse>

The literal process is just a simple process used for debugging where the inputs are reported back as outputs.
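A client usually wants the output values rather than the whole document. The following sketch pulls the literal outputs out of a shortened copy of the response above using the standard ElementTree module; literal_outputs is a hypothetical helper and the sample keeps only the elements it needs.

```python
import xml.etree.ElementTree as ET

NS = {
    "wps": "http://www.opengis.net/wps/1.0.0",
    "ows": "http://www.opengis.net/ows/1.1",
}

# Shortened copy of the response document.
RESPONSE = """<wps:ExecuteResponse
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1" service="WPS" version="1.0.0">
 <wps:ProcessOutputs>
  <wps:Output>
   <ows:Identifier>int</ows:Identifier>
   <wps:Data><wps:LiteralData dataType="integer">1</wps:LiteralData></wps:Data>
  </wps:Output>
  <wps:Output>
   <ows:Identifier>float</ows:Identifier>
   <wps:Data><wps:LiteralData dataType="float">3.2</wps:LiteralData></wps:Data>
  </wps:Output>
  <wps:Output>
   <ows:Identifier>string</ows:Identifier>
   <wps:Data><wps:LiteralData dataType="string">spam</wps:LiteralData></wps:Data>
  </wps:Output>
 </wps:ProcessOutputs>
</wps:ExecuteResponse>"""


def literal_outputs(xml_text):
    """Map each output identifier to its (dataType, text value) pair."""
    root = ET.fromstring(xml_text)
    results = {}
    for output in root.findall("wps:ProcessOutputs/wps:Output", NS):
        identifier = output.findtext("ows:Identifier", namespaces=NS)
        literal = output.find("wps:Data/wps:LiteralData", NS)
        if literal is not None:
            results[identifier] = (literal.get("dataType"), literal.text)
    return results


outputs = literal_outputs(RESPONSE)
print(outputs)
```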

Outputs with ComplexData can be rather verbose. For example, the following Execute request will run a process that takes an XML document and a raster image as inputs and, after a few seconds of pause, returns the inputs.

http://apps.esdi-humboldt.cz/pywps/?
service=wps&version=1.0.0&
request=Execute&
identifier=complexprocess&
datainputs=vectorin=http://apps.esdi-humboldt.cz/classification/traning_areas/training_areas_en.gml;
rasterin=http://rsg.pml.ac.uk/staff/jmdj/raster.tif;
pause=0&

The returned XML is extremely verbose, so only the beginning is shown below:

<!-- Something something something -->
<wps:ProcessOutputs>
 <wps:Output>
 <ows:Identifier>vectorout</ows:Identifier>
 <ows:Title>Vector file</ows:Title>
 <wps:Data>
 <wps:ComplexData mimeType="text/xml">
<ogr:FeatureCollection
 xmlns:ogr="http://ogr.maptools.org/"
 xmlns:gml="http://www.opengis.net/gml">

 <gml:boundedBy>
 <gml:Box>
 <gml:coord><gml:X>-559044.5280103994</gml:X><gml:Y>-1177026.734255324</gml:Y></gml:coord>
 <gml:coord><gml:X>-554835.891394174</gml:X><gml:Y>-1169621.932698363</gml:Y></gml:coord>
 </gml:Box>
 </gml:boundedBy>
 <gml:featureMember>

 <ogr:features fid="F0">
 <ogr:geometryProperty><gml:Polygon><gml:outerBoundaryIs>
<gml:LinearRing><gml:coordinates>-555043.324615493183956,-1174010.838661683257669 -554930.435787564259954,
-1174159.005248340079561 -555085.657925966545008,-1174293.060731505509466
-555276.157823096611537,-1174201.338558813324198 -555191.491202149889432,
-1174088.449730884516612 -555043.324615493183956,-1174010.838661683257669</gml:coordinates>
</gml:LinearRing></gml:outerBoundaryIs></gml:Polygon></ogr:geometryProperty>
 <ogr:areaClass>1</ogr:areaClass>
 <ogr:classLabel>broad_leaved</ogr:classLabel>

<!-- more stuff-->
<ows:Identifier>rasterout</ows:Identifier>
<ows:Title>Raster file</ows:Title>
<wps:Data>
<wps:ComplexData mimeType="image/tiff">
SUkqAEZlAAAjIyMjIyMjIyMjIyMjIyMjIyM
<!-- even more stuff-->

WPS supports the use of links inside the response document pointing to where the outputs are stored. This is requested as follows:


...responsedocument=vectorout=@asreference=true;
rasterout=@asreference=true

Here a new parameter (responsedocument) is introduced that sets the output data attribute "asreference" to true, resulting in a simpler XML response document:

<wps:ExecuteResponse xmlns:wps="http://www.opengis.net/wps/1.0.0"
xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.opengis.net/wps/1.0.0
http://schemas.opengis.net/wps/1.0.0/wpsGetCapabilities_response.xsd"
service="WPS" version="1.0.0" xml:lang="eng"
serviceInstance="http://appd.esdi-humboldt.cz/pywps/?service=WPS&amp;request=GetCapabilities&amp;version=1.0.0"
statusLocation="http://apps.esdi-humboldt.cz/wps/wpsoutputs/pywps-128222940863.xml">
 <wps:Process wps:processVersion="None">
 <ows:Identifier>complexprocess</ows:Identifier>
 <ows:Title>Complex process</ows:Title>
 </wps:Process>
 <wps:Status creationTime="Thu Aug 19 16:50:33 2010">
 <wps:ProcessSucceeded>PyWPS Process complexprocess successfully calculated</wps:ProcessSucceeded>

 </wps:Status>
 <wps:ProcessOutputs>
 <wps:Output>
 <ows:Identifier>vectorout</ows:Identifier>
 <ows:Title>Vector file</ows:Title>
 <wps:Reference xlink:href="http://apps.esdi-humboldt.cz/wps/wpsoutputs/vectorout-22983" mimeType="text/xml"/>
 </wps:Output>
 <wps:Output>

 <ows:Identifier>rasterout</ows:Identifier>
 <ows:Title>Raster file</ows:Title>
 <wps:Reference xlink:href="http://apps.esdi-humboldt.cz/wps/wpsoutputs/rasterout-22983" mimeType="image/tiff"/>
 </wps:Output>
 </wps:ProcessOutputs>
</wps:ExecuteResponse>
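When outputs come back as references, a client typically collects the links so that it can download them later. The sketch below extracts the xlink:href and mimeType of each wps:Reference from a shortened copy of the document above; output_references is a made-up name.

```python
import xml.etree.ElementTree as ET

NS = {
    "wps": "http://www.opengis.net/wps/1.0.0",
    "ows": "http://www.opengis.net/ows/1.1",
}
# ElementTree exposes namespaced attributes under their expanded name.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

RESPONSE = """<wps:ExecuteResponse
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1"
    xmlns:xlink="http://www.w3.org/1999/xlink">
 <wps:ProcessOutputs>
  <wps:Output>
   <ows:Identifier>vectorout</ows:Identifier>
   <wps:Reference xlink:href="http://apps.esdi-humboldt.cz/wps/wpsoutputs/vectorout-22983" mimeType="text/xml"/>
  </wps:Output>
  <wps:Output>
   <ows:Identifier>rasterout</ows:Identifier>
   <wps:Reference xlink:href="http://apps.esdi-humboldt.cz/wps/wpsoutputs/rasterout-22983" mimeType="image/tiff"/>
  </wps:Output>
 </wps:ProcessOutputs>
</wps:ExecuteResponse>"""


def output_references(xml_text):
    """Map each output identifier to its (URL, MIME type) pair."""
    root = ET.fromstring(xml_text)
    references = {}
    for output in root.findall("wps:ProcessOutputs/wps:Output", NS):
        reference = output.find("wps:Reference", NS)
        if reference is not None:
            identifier = output.findtext("ows:Identifier", namespaces=NS)
            references[identifier] = (
                reference.get(XLINK_HREF),
                reference.get("mimeType"),
            )
    return references


references = output_references(RESPONSE)
print(references)
```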

XML Execute request

Most of the time an Execute request is done using an XML document with the same parameters as explained in the KVP section. The XML document is divided into 3 major sections:

  1. Service identification, language and version request
  2. Data inputs
  3. Structure of response document

The following KVP:

http://apps.esdi-humboldt.cz/pywps/?
service=WPS&version=1.0.0&
request=Execute&
identifier=literalprocess&
datainputs=int=1;float=3.2;zeroset=0;string=spam&
storeExecuteResponse=false&
lineage=true&
status=false

Would have the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<wps:Execute service="WPS" version="1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsExecute_request.xsd">
 <ows:Identifier>literalprocess</ows:Identifier>
 <wps:DataInputs>
 <wps:Input xmlns:xlink="http://www.w3.org/1999/xlink">
 <ows:Identifier>int</ows:Identifier>
 <wps:Data>
 <wps:LiteralData dataType="xs:integer">1</wps:LiteralData>
 </wps:Data>
 </wps:Input>
 <wps:Input>
 <ows:Identifier>float</ows:Identifier>
 <wps:Data>
 <wps:LiteralData dataType="xs:float">3.2</wps:LiteralData>
 </wps:Data>
 </wps:Input>
 <wps:Input>
 <ows:Identifier>zeroset</ows:Identifier>
 <wps:Data>
 <wps:LiteralData dataType="xs:integer">0</wps:LiteralData>
 </wps:Data>
 </wps:Input>
 <wps:Input>
 <ows:Identifier>string</ows:Identifier>
 <wps:Data>
 <wps:LiteralData dataType="xs:string">spam</wps:LiteralData>
 </wps:Data>
 </wps:Input>
 </wps:DataInputs>
 <wps:ResponseForm>
 <wps:ResponseDocument lineage="true" storeExecuteResponse="false" status="false">
 <wps:Output asReference="false">
 <ows:Identifier>int</ows:Identifier>
 </wps:Output>
 <wps:Output asReference="false">
 <ows:Identifier>float</ows:Identifier>
 </wps:Output>
 <wps:Output asReference="false">
 <ows:Identifier>zeroset</ows:Identifier>
 </wps:Output>
 <wps:Output asReference="false">
 <ows:Identifier>string</ows:Identifier>
 </wps:Output>
 </wps:ResponseDocument>
 </wps:ResponseForm>
</wps:Execute>

The lineage, storeExecuteResponse and status parameters are now attributes of the ResponseDocument element, and asReference remains an attribute of the Output element.
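Writing such documents by hand is error-prone, so a client may prefer to generate them. The sketch below builds a simplified Execute request for literal inputs with ElementTree. The function build_execute is hypothetical, and for brevity it omits the dataType attributes, the schemaLocation and the per-output wps:Output elements shown in the full example.

```python
import xml.etree.ElementTree as ET

WPS = "http://www.opengis.net/wps/1.0.0"
OWS = "http://www.opengis.net/ows/1.1"
ET.register_namespace("wps", WPS)
ET.register_namespace("ows", OWS)


def build_execute(identifier, literal_inputs, lineage=False, store=False, status=False):
    """Build a simplified wps:Execute document for literal inputs only."""
    root = ET.Element(f"{{{WPS}}}Execute", {"service": "WPS", "version": "1.0.0"})
    ET.SubElement(root, f"{{{OWS}}}Identifier").text = identifier

    data_inputs = ET.SubElement(root, f"{{{WPS}}}DataInputs")
    for name, value in literal_inputs.items():
        input_el = ET.SubElement(data_inputs, f"{{{WPS}}}Input")
        ET.SubElement(input_el, f"{{{OWS}}}Identifier").text = name
        data = ET.SubElement(input_el, f"{{{WPS}}}Data")
        ET.SubElement(data, f"{{{WPS}}}LiteralData").text = str(value)

    # The KVP parameters become attributes of ResponseDocument.
    form = ET.SubElement(root, f"{{{WPS}}}ResponseForm")
    ET.SubElement(form, f"{{{WPS}}}ResponseDocument", {
        "lineage": str(lineage).lower(),
        "storeExecuteResponse": str(store).lower(),
        "status": str(status).lower(),
    })
    return ET.tostring(root, encoding="unicode")


xml_text = build_execute(
    "literalprocess",
    {"int": 1, "float": 3.2, "zeroset": 0, "string": "spam"},
    lineage=True,
)
print(xml_text)
```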

Response Document

The section "KVP Execute request" shows an example where outputs are requested as references using the ResponseDocument KVP. This parameter can also be used to generate an output document that contains only one (or more) of the process's outputs. For example, if the user only needs the int output from the literalprocess process (which outputs int, float, zeroset and string):

http://apps.esdi-humboldt.cz/pywps/?
service=WPS&version=1.0.0&
request=Execute&
identifier=literalprocess&
datainputs=[int=1;float=3.2;zeroset=0;string=spam]&
storeExecuteResponse=false&
lineage=true&
status=false&
responsedocument=int

The response document will have a ProcessOutputs section like this:

<wps:ProcessOutputs>
<wps:Output>
     <ows:Identifier>int</ows:Identifier>
     <ows:Title>Integer data out</ows:Title>
     <wps:Data>
            <wps:LiteralData dataType="integer">1</wps:LiteralData> 
     </wps:Data>
</wps:Output>
</wps:ProcessOutputs>

Please keep in mind that omitting ResponseDocument returns all outputs. If ResponseDocument is present, the user has to specify which outputs are needed.

Lineage

A response document may contain the data inputs used to run the WPS process; this is normally referred to as lineage. To request a lineage structure in the response document, the KVP GET request should contain lineage=true, as follows:

storeExecuteResponse=false&
lineage=true&
status=false&

NOTE: Lineage is an independent KVP that is NOT associated with responseDocument properties.

In a POST XML request, lineage is an attribute of the ResponseDocument element:

<wps:ResponseDocument lineage="true" storeExecuteResponse="false" status="false">

Reference

WPS may return a specific output as a reference, meaning the output is returned as a URL that can be used to fetch the content. This allows the user to accept a "light" WPS response and fetch the process outputs later, which is very practical if the outputs are big images or long XML structures.

asReference is specified as an attribute per output in the WPS POST request as follows:

:
<wps:ResponseForm>
 <wps:ResponseDocument lineage="false" storeExecuteResponse="false" status="false">
 <wps:Output asReference="true">
 <ows:Identifier>bigXML</ows:Identifier>
 </wps:Output>
 :
 </wps:ResponseDocument>
</wps:ResponseForm>

A GET KVP reference request is made using the output attribute, e.g.:

&responsedocument=bigXML=@asreference=true

The response document will contain the URL in a quoted (URL-encoded) format, which can be directly "piped" to another WPS but has to be unquoted to be used by other software such as a browser or QGIS.
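Unquoting such a URL is a one-liner with Python's standard library. In the sketch below the quoted string is an assumed example of what a quoted reference might look like; the exact form emitted by a given server may differ.

```python
from urllib.parse import unquote

# Assumed example of a quoted reference URL; real responses may differ.
quoted = "http%3A%2F%2Fapps.esdi-humboldt.cz%2Fwps%2Fwpsoutputs%2Fvectorout-22983"

plain = unquote(quoted)
print(plain)
```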

--Wikiadmin 14:34, 11 January 2011 (UTC)