Orchestration

Orchestration (Music maestro !!!!)

WPS is a nice way to call processes and run GIS in an SOA philosophy, but it requires the user to write XML requests, process the outputs and, if necessary, call further processes, so it is time consuming and not very "friendly".

Several strategies have been used to deal with orchestration in WPS:

- A generic WPS process that accepts a workflow description, calls other WPS processes and reports an output. One good example is being developed by 52 North (usage).
- Use of BPEL to build a workflow and run it based on WSDL access to WPS (BPEL).
- Use of HTTP GET and POST in the Galaxy platform/workbench. Fondazione Bruno Kessler has implemented this in the ENVIROCHANGE project.

Orchestration has been used successfully in bioinformatics, where the experience gained and the software developed are far ahead of geoinformatics. In bioinformatics, services run using WSDL/SOAP and interaction with the user happens in workbench software; Wikipedia lists 20 workbenches that help users assemble workflows and orchestrate services (see list).

Annexes E and F of the WPS document give a short description of how to use WSDL/SOAP inside WPS. The specification is very loose and is only a recommendation, and examples using WPS/WSDL are scarce. The WSDL/SOAP implementation in PyWPS was done in such a way that it can be integrated in a generic workbench.

PyWPS was developed to integrate with the Taverna workbench as well as possible using WSDL/SOAP; nevertheless, other generic workbenches should work. The chosen workbench, Taverna (home page: http://www.taverna.org.uk/), is under heavy development, offers a large number of functionalities and has a growing "ecosystem" being created around it (for example myExperiment, where users can share their workflows). Taverna is a desktop application but also a server that can run an orchestration request based on an XML workflow description sent to it.


WSDL Generation

The WSDL file is obtained by making a WSDL request to the WPS instance, as indicated in the WPS documentation:

http://foo/wps.py?WSDL

This file describes the request and response of every process provided by the WPS server. It is generated dynamically on each call by applying an XSLT template to a describeProcess XML output.

So a process only has to be loaded by PyWPS for its description to appear in the WSDL.

The WSDL file also contains extra information, such as the server name and location, taken from the pywps.cfg file.
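
As a rough illustration of the request above, the following Python sketch builds the WSDL URL for an endpoint and downloads the document. The function names are hypothetical, and the endpoint is the placeholder used in this page; fetching naturally requires a running server.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def wsdl_url(endpoint: str) -> str:
    """Return the WSDL request URL for a WPS endpoint (the query is just 'WSDL')."""
    parts = urlsplit(endpoint)
    # Replace any existing query string with the bare WSDL keyword.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "WSDL", ""))


def fetch_wsdl(endpoint: str, timeout: float = 30.0) -> bytes:
    """Download the dynamically generated WSDL document (needs a live server)."""
    with urlopen(wsdl_url(endpoint), timeout=timeout) as response:
        return response.read()


print(wsdl_url("http://foo/wps.py"))
```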

SOAP Structure

PyWPS implements basic SOAP 1.1 and 1.2 envelope requests, but still lacks support for SOAP authentication and features such as mustUnderstand. The SOAP implementation is the minimum necessary to run a workflow workbench.

Generic WPS requests

Generic WPS requests are the typical WPS requests wrapped in a SOAP envelope. A GetCapabilities request looks like this:

<?xml version="1.0" encoding="UTF-8"?>

 <!--
 Equivalent GET request is
 http://foo.bar/foo?Service=WPS&Version=1.0.0&Request=GetCapabilities&Language=en-CA
 -->
<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
 xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/1999/XMLSchema">
 <SOAP-ENV:Body>
     <wps:GetCapabilities xmlns:ows="http://www.opengis.net/ows/1.1"
      xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 ../wpsGetCapabilities_request.xsd"
      language="en" service="WPS">
           <wps:AcceptVersions>
               <ows:Version>1.0.0</ows:Version>
           </wps:AcceptVersions>
     </wps:GetCapabilities>
 </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

PyWPS supports GetCapabilities/DescribeProcess/Execute requests made this way. The WSDL file also "understands" that a generic WPS request was made and expects a generic WPS response inside the envelope.
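
A generic request of this kind can also be assembled programmatically. The sketch below uses only the Python standard library; the function name is hypothetical, and the optional encodingStyle and schemaLocation attributes of the sample are left out for brevity.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
WPS = "http://www.opengis.net/wps/1.0.0"
OWS = "http://www.opengis.net/ows/1.1"


def soap_get_capabilities(version: str = "1.0.0", language: str = "en") -> bytes:
    """Wrap a WPS GetCapabilities request in a SOAP 1.1 envelope."""
    # Use the same prefixes as the hand-written sample.
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    ET.register_namespace("wps", WPS)
    ET.register_namespace("ows", OWS)

    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(
        body, f"{{{WPS}}}GetCapabilities", {"service": "WPS", "language": language}
    )
    versions = ET.SubElement(request, f"{{{WPS}}}AcceptVersions")
    ET.SubElement(versions, f"{{{OWS}}}Version").text = version
    return ET.tostring(envelope, encoding="UTF-8", xml_declaration=True)


print(soap_get_capabilities().decode("utf-8"))
```

The resulting bytes would be sent as the body of an HTTP POST to the WPS endpoint.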

Execute_<ProcessName>

WSDL/SOAP starts to shine when Execute_<ProcessName> requests are made to the server. Execute_<ProcessName> is a dedicated way to call an Execute request of a specific process, and it only works inside a SOAP request. In the examples below the element is named ExecuteProcess_<ProcessName>:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <soap:Body>
 <ExecuteProcess_dummyprocess>
 <input1>10</input1>
 <input2>20</input2>
 </ExecuteProcess_dummyprocess>
 </soap:Body>
 </soap:Envelope>

In this case we request the execution of dummyprocess with two inputs, input1 and input2. The process inputs become XML elements of their own; there is no longer any LiteralData, ComplexData or BBOX wrapper. Process names and input names are CASE SENSITIVE and must match the names defined in the process class.
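
The request structure described above could be generated from a dictionary of inputs, as in this sketch. The helper name is hypothetical, and the child elements are written without a namespace, as in the sample request.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def soap_execute(process: str, inputs: dict) -> bytes:
    """Build an ExecuteProcess_<process> request with one child element per input."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # Process and input names are case sensitive: use them exactly as given.
    call = ET.SubElement(body, f"ExecuteProcess_{process}")
    for name, value in inputs.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="UTF-8")


print(soap_execute("dummyprocess", {"input1": 10, "input2": 20}).decode("utf-8"))
```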

The output will be:

<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance">
<SOAP-ENV:Body>
<ExecuteProcess_dummyprocessResponse>
<output2Result>10</output2Result>
<output1Result>10</output1Result>
</ExecuteProcess_dummyprocessResponse>
</SOAP-ENV:Body></SOAP-ENV:Envelope>

Following the WSDL nomenclature (best practice?), the response document appends "Response" to the name of the process execute request and "Result" to the output names.
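
On the client side, this naming convention makes the response easy to unpack. The sketch below is one possible reading of it, assuming the envelope declares the SOAP-ENV namespace and that the response and output elements carry no namespace, as in the sample; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def parse_execute_response(xml_text: str, process: str) -> dict:
    """Return {output name: text} from an ExecuteProcess_<process>Response message."""
    root = ET.fromstring(xml_text)
    response = root.find(f"{{{SOAP_ENV}}}Body/ExecuteProcess_{process}Response")
    if response is None:
        raise ValueError(f"no ExecuteProcess_{process}Response element in SOAP body")
    outputs = {}
    for child in response:
        name = child.tag
        # Outputs carry a "Result" suffix; strip it to recover the output name.
        if name.endswith("Result"):
            name = name[: -len("Result")]
        outputs[name] = child.text
    return outputs


SAMPLE = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<ExecuteProcess_dummyprocessResponse>
<output2Result>10</output2Result>
<output1Result>10</output1Result>
</ExecuteProcess_dummyprocessResponse>
</SOAP-ENV:Body></SOAP-ENV:Envelope>"""

print(parse_execute_response(SAMPLE, "dummyprocess"))
```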

The WSDL messages for the request and response of this process are:

Request WSDL schema structure

<schema targetNamespace="http://www.opengis.net/wps/1.0.0">
<element name="ExecuteProcess_dummyprocess">
<complexType>
<sequence>
 <element minOccurs="1" maxOccurs="1" name="input2" type="xsd:float"/>
 <element minOccurs="1" maxOccurs="1" name="input1" type="xsd:float"/>
</sequence>
</complexType>
</element>
</schema>

Response WSDL schema structure

<schema targetNamespace="http://www.opengis.net/wps/1.0.0">
<element name="ExecuteProcess_dummyprocessResponse">
<complexType>
<sequence>
<element name="output2Result" minOccurs="1" maxOccurs="1" type="xsd:float"/>
<element name="output1Result" minOccurs="1" maxOccurs="1" type="xsd:float"/>
</sequence>
</complexType>
</element>
</schema>

Input and Output Names

WPS and OGC standards normally use a text value to identify inputs and outputs:

<ows:Identifier>Input1</ows:Identifier>

PyWPS has to convert the string inside the Identifier element into a new element that wraps the I/O content. Element names have to follow the W3C rules on which characters can be used [1], so in some situations the text content may be converted into an illegal element name. For example, it is common for WPS processes to copy a bash command structure, using the default flag nomenclature:

 <ows:Identifier>gdalinfo</ows:Identifier>
        <ows:Title>GDALinfo command</ows:Title>
        <ows:Abstract>GDALinfo command to check image properties</ows:Abstract>
        <DataInputs>
            <Input minOccurs="0" maxOccurs="1">
                <ows:Identifier>-nogcp</ows:Identifier>
:
:
            <Input minOccurs="0" maxOccurs="1">
                <ows:Identifier>-stats</ows:Identifier>
                <ows:Title>Read and display image statistics. Force computation if no statistics are stored in an image. </ows:Title>
:
:

The "-" or "--" commonly used to indicate a flag can't be used as the starting character of an element name; using such characters raises XML parsing exceptions.

Currently, SVN version 1129 uses regular expressions to remove unwanted characters in the WSDL process description, and when a process is run the I/O names from SOAP are mapped back to the original names in the WPS description. Therefore a user doesn't have to worry about which flags or characters are used as identifiers.
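
The idea can be sketched in a few lines of Python. This is not the regular expression used in PyWPS itself; the patterns and function names below are assumptions chosen for illustration, restricted to ASCII names for simplicity, and "--version" is a hypothetical identifier.

```python
import re

# XML names may not start with "-", "." or a digit; ASCII-only for simplicity.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_.-]")
_INVALID_START = re.compile(r"^[^A-Za-z_]+")


def to_element_name(identifier: str) -> str:
    """Turn a WPS identifier such as "-stats" into a legal XML element name."""
    name = _INVALID_CHARS.sub("", identifier)
    name = _INVALID_START.sub("", name)
    if not name:
        raise ValueError(f"identifier {identifier!r} has no usable characters")
    return name


def build_name_map(identifiers):
    """Map sanitized element names back to the original WPS identifiers."""
    mapping = {}
    for identifier in identifiers:
        name = to_element_name(identifier)
        if name in mapping:
            raise ValueError(f"{identifier!r} and {mapping[name]!r} both map to {name!r}")
        mapping[name] = identifier
    return mapping


name_map = build_name_map(["-nogcp", "-stats", "--version"])
print(name_map)
```

The mapping is what lets an incoming SOAP element such as stats be translated back to the identifier -stats before the process runs. Two identifiers that collapse to the same element name are rejected rather than silently merged.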

typeData schema

The WSDL schema section defines the input/output message types. In the example above the WSDL contains type information (type="xsd:float"); this information is gathered automatically from the describeProcess XML:

<LiteralData>
<ows:DataType ows:reference="http://www.w3.org/TR/xmlschema-2/#float">float</ows:DataType>
<ows:AnyValue/>

It's easy to gather the data type for integer, float and string; the problem is ComplexData. Currently, data contained inside ComplexData will not have a type set in the WSDL, and the majority of systems that consume WSDL will assume type="xsd:anyType". An example schema type for a buffer GML output:

<schema targetNamespace="http://www.opengis.net/wps/1.0.0">
<element name="ExecuteProcess_complexVectorResponse">
<complexType>
<sequence>
<element name="outdataResult" minOccurs="1" maxOccurs="1"/>
</sequence>
</complexType>
</element>
</schema>

Please check the XML Input/Output section for more detailed information on type="xsd:anyType" and other issues.
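
The type lookup can be pictured roughly as follows. PyWPS does this with an XSLT template; the Python function below is only an equivalent sketch with a hypothetical name, and the two XML snippets are reduced stand-ins for describeProcess fragments.

```python
import xml.etree.ElementTree as ET

OWS = "http://www.opengis.net/ows/1.1"


def xsd_type_for(io_element: ET.Element):
    """Return an xsd type such as "xsd:float", or None when no type can be derived."""
    data_type = io_element.find(f".//{{{OWS}}}DataType")
    if data_type is None:
        # ComplexData (or an untyped literal): the WSDL element gets no type attribute.
        return None
    reference = data_type.get(f"{{{OWS}}}reference", "")
    # Prefer the fragment of the reference URL, fall back to the element text.
    type_name = reference.partition("#")[2] or (data_type.text or "").strip()
    return f"xsd:{type_name}" if type_name else None


LITERAL = """<Input xmlns:ows="http://www.opengis.net/ows/1.1">
  <LiteralData>
    <ows:DataType ows:reference="http://www.w3.org/TR/xmlschema-2/#float">float</ows:DataType>
    <ows:AnyValue/>
  </LiteralData>
</Input>"""
COMPLEX = "<Output><ComplexOutput/></Output>"

print(xsd_type_for(ET.fromstring(LITERAL)))
print(xsd_type_for(ET.fromstring(COMPLEX)))
```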

Exception Report

Considerable effort went into integrating the SOAP, WSDL and OGC fault messages/exception reports. SOAP defines a special envelope for errors/exceptions, and this envelope was integrated to some extent in PyWPS with one simple assumption: any error is caused by the client. SOAP defines 4 types of fault code, but only the client error is implemented (see: [2]). A SOAP fault has new tags that need filling: <faultcode>, <faultstring>, <faultactor> and <detail>, with <faultcode> and <faultstring> as mandatory elements in the response.

<faultstring> is a simple string describing the problem; in PyWPS it contains the ows:Exception content and any text as CDATA. The <detail> element contains the complete ows:ExceptionReport XML, which can be parsed for extra information:

<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance">
<SOAP-ENV:Body>

<SOAP-ENV:Fault>
<faultcode>SOAP-ENV:Client</faultcode>
<faultstring><![CDATA[<ows:Exception xmlns:ows="http://www.opengis.net/ows/1.1"
 xmlns:wps="http://www.opengis.net/wps/1.0.0"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
exceptionCode="MissingParameterValue" locator="input2"/>
 ]]></faultstring>

<detail><wps:ExceptionReport xmlns:wps="http://www.opengis.net/wps/1.0.0"
xmlns:ows="http://www.opengis.net/ows/1.1"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <ows:Exception exceptionCode="MissingParameterValue" locator="input2"/>
 </wps:ExceptionReport>
 </detail>

</SOAP-ENV:Fault>

</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

<faultcode> will always contain the standard client error code: SOAP-ENV:Client
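
A client can detect such a fault and read the structured exception from <detail> rather than from the CDATA string. The sketch below assumes the envelope declares the SOAP-ENV namespace and that <faultcode> and <detail> are unqualified, as in the sample; the function name and the shortened sample message are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
OWS = "http://www.opengis.net/ows/1.1"


def parse_fault(xml_text: str):
    """Return None for a normal response, or a dict describing the SOAP fault."""
    root = ET.fromstring(xml_text)
    fault = root.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    if fault is None:
        return None
    # Read the structured ows:Exception entries from <detail>, not the CDATA string.
    exceptions = [
        {"code": exc.get("exceptionCode"), "locator": exc.get("locator")}
        for exc in fault.iterfind(f"detail//{{{OWS}}}Exception")
    ]
    return {"faultcode": fault.findtext("faultcode"), "exceptions": exceptions}


SAMPLE = """<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body><SOAP-ENV:Fault>
<faultcode>SOAP-ENV:Client</faultcode>
<faultstring>MissingParameterValue</faultstring>
<detail><wps:ExceptionReport xmlns:wps="http://www.opengis.net/wps/1.0.0"
 xmlns:ows="http://www.opengis.net/ows/1.1">
<ows:Exception exceptionCode="MissingParameterValue" locator="input2"/>
</wps:ExceptionReport></detail>
</SOAP-ENV:Fault></SOAP-ENV:Body></SOAP-ENV:Envelope>"""

print(parse_fault(SAMPLE))
```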

async WPS calls

Async calls are a thorny subject. WPS uses attributes to set them up, but there is no recommendation on how to pass them using an Execute_ structure. Even setting attributes in the Taverna workbench is complicated and needs a beanshell script to put them in the service request. Please check the section: Async request.


--Wikiadmin 17:08, 10 January 2011 (UTC)