R-cran

R is "GNU S", a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project [homepage http://www.r-project.org/] for further information.

R has several spatial packages, such as rgdal, rgeos, raster, RArcInfo and gstat. Users are advised to check R's Spatial task view http://cran.r-project.org/web/views/Spatial.html and to follow the installation procedure described at http://cran.r-project.org/web/views/. Questions and problems can often be resolved by searching the archives of the [R-sig-geo https://stat.ethz.ch/mailman/listinfo/r-sig-geo] mailing list, or by sending an email to it.

PyWPS has no native interface for R, but none is needed, since R connects extremely well to Python through Rpy ([Rpy2 http://rpy.sourceforge.net/rpy2/doc-2.1/html/overview.html]). Rpy is probably the best R connector available.

Package loading

Rpy/Python code runs as normal inside the PyWPS execute() method, in a similar way to a GRASS process: use getValue() to retrieve the input data and pass it to the R environment, and setValue() to hand the results back as WPS outputs.

The major problem reported by users who have developed WPS services around R is the verbose nature of some packages, which makes Apache raise an Internal Server Error. For example, loading the rgdal library in a script causes the following error:

[Tue Apr 19 09:13:33 2011] [error] [client ::1] malformed header from script. Bad header=Geospatial Data Abstraction Li: wps.cgi

If the same request is run from a bash shell, it outputs the following:

Loading required package: sp
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 1.8.0, released 2011/01/12
Path to GDAL shared files: /usr/local/share/gdal
Loaded PROJ.4 runtime: Rel. 4.7.1, 23 September 2009
Path to PROJ.4 shared files: (autodetected)
Content-Type: application/xml

<?xml version="1.0" encoding="utf-8"?>
<wps:ExecuteResponse xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsExecute_response.xsd" service="WPS" version="1.0.0" xml:lang="en-CA" serviceInstance="http://localhost/wps.cgi?service=WPS&request=GetCapabilities&version=1.0.0" statusLocation="http://localhost/wpsoutputs/pywps-130320133939.xml">
    <wps:Process wps:processVersion="1.0">
        <ows:Identifier>bufferRGEOS</ows:Identifier>

Apache receives the verbose package-loading output first, treats it as a malformed HTTP header and raises an error.
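The failure can be illustrated with a small sketch. The helper first_line_is_header() is hypothetical (it is not part of PyWPS or Apache) and only approximates the rule a CGI gateway applies: the first line written by the script has to look like "Name: value".

```python
import re

# Rough approximation of a CGI header line: a name without spaces, a colon,
# then a value. Hypothetical helper; Apache's real parser is stricter.
HEADER_LINE = re.compile(r"^[A-Za-z0-9-]+:\s*\S")


def first_line_is_header(cgi_output):
    """Return True if the first line of the script output looks like a header."""
    lines = cgi_output.splitlines()
    first_line = lines[0] if lines else ""
    return bool(HEADER_LINE.match(first_line))


clean = 'Content-Type: application/xml\n\n<?xml version="1.0" encoding="utf-8"?>'
noisy = (
    "Geospatial Data Abstraction Library extensions to R successfully loaded\n"
    + clean
)
```

With these samples the clean response passes the check, while the response preceded by the rgdal banner does not, which mirrors the "malformed header" entry in the Apache log.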

One way to overcome the problem is to reduce the package-loading output by changing R's [options http://stat.ethz.ch/R-manual/R-patched/library/base/html/options.html].

Another solution is for PyWPS to divert stdout so that the output is not passed to Apache. For more information on stdout and stderr see: http://diveintopython.org/scripts_and_streams/stdin_stdout_stderr.html

Black hole silence

R's verbose output can be directed to "/dev/null", where it disappears into a "black hole". This has to be done before the package is loaded.

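  def execute(self):
     # ...
     import sys
     import rpy2.robjects as robjects
     R = robjects.r
     # Silence mode: discard everything written to stdout
     sys.stdout = open("/dev/null", "w")
     R["library"]("rgdal")
     # ...
     # Back to normal: close the sink and restore the real stdout
     sys.stdout.close()
     sys.stdout = sys.__stdout__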

After running the R code, stdout must be restored to its default, otherwise the WPS response will end up in /dev/null.

A more portable solution (no /dev/null) can be found here [1]
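One portable variant can be sketched with os.devnull, which resolves to the null device of the current platform. This is a minimal sketch assuming Python 3; the name silenced_stdout() is hypothetical, and print() stands in for the R call. It silences only output that goes through Python's sys.stdout, which is the same assumption the examples on this page make.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def silenced_stdout():
    """Temporarily send sys.stdout to the platform's null device."""
    saved = sys.stdout
    with open(os.devnull, "w") as sink:
        sys.stdout = sink
        try:
            yield
        finally:
            # Runs on success and on error, so the WPS response is never lost
            sys.stdout = saved


with silenced_stdout():
    # Stand-in for R["library"]("rgdal")
    print("this line is discarded")
print("stdout is back")
```

The helper restores whatever stream was active before, instead of sys.__stdout__, so it also behaves when stdout had already been replaced by something else.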

Exceptions

Switching stdout "off" can cause problems when exceptions are raised, since their output is silenced as well. One solution is to wrap the code in a try/except clause that switches stdout back "on" if something goes wrong.

R = robjects.r
try:
     # Silence mode...
     sys.stdout = open("/dev/null", "w")
     R["library"]("rgdal")
     # ... more R code ...
except Exception:
     # Restore stdout, then re-raise the original exception unchanged
     sys.stdout = sys.__stdout__
     raise

Logged silence

A better solution is to redirect R's verbose output to the PyWPS log, so that any problem with package loading is recorded.

It is enough to redirect stdout to the PyWPS log like this:

  def execute(self):
     # ...
     import sys
     import rpy2.robjects as robjects
     R = robjects.r
     # Dump R's output to the PyWPS log file
     sys.stdout = self.logFile
     R["library"]("rgdal")
     # ...
     # Back to normal
     sys.stdout = sys.__stdout__
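The same pattern can be tried outside PyWPS. In the sketch below, run_logged() and noisy_task() are hypothetical names, io.StringIO stands in for self.logFile, and print() stands in for the chatter of an R package; the try/finally guarantees that stdout is restored even if the task fails.

```python
import io
import sys


def run_logged(task, log_file):
    """Run task() with sys.stdout pointed at log_file, then restore it."""
    saved = sys.stdout
    sys.stdout = log_file
    try:
        return task()
    finally:
        sys.stdout = saved


def noisy_task():
    # Stand-in for R["library"]("rgdal") and its start-up messages
    print("Geospatial Data Abstraction Library extensions to R successfully loaded")
    return "done"


log = io.StringIO()  # stand-in for self.logFile
result = run_logged(noisy_task, log)
```

After the call, the banner is in the log object rather than in the response stream, and the return value of the task is passed through unchanged.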

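Suppress Package Startup Messages

The third possibility is to use the R function suppressPackageStartupMessages() when loading a package. The library() call has to be evaluated by R inside that function, so the whole expression is passed as R code, e.g.:

R('suppressPackageStartupMessages(library(rgdal))')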

This solution is not bulletproof, since some packages use cat() instead of message() to print information. Output produced with cat() is not suppressed.

Library location

Depending on the server's configuration, it may be necessary to set the R environment variable R_LIBS or R_LIBS_USER. This is normally done at the start of the process, before importing rpy2, e.g.:

def execute(self):
    import os
    os.environ["R_LIBS_USER"]="/home/user/R/x86_64-unknown-linux-gnu-library/2.12"
    import rpy2.robjects as robjects
    # ...
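A slightly more defensive variant checks that the library directory exists before exporting it. The helper set_r_library_path() is hypothetical, and the env parameter exists only so the function can be tried without touching the real environment; inside a process it would be called before rpy2 is imported.

```python
import os


def set_r_library_path(path, env=None):
    """Set R_LIBS_USER to path if it is an existing directory; return success."""
    env = os.environ if env is None else env
    if not os.path.isdir(path):
        # Leave the environment untouched rather than point R at nothing
        return False
    env["R_LIBS_USER"] = path
    return True


fake_env = {}  # stand-in for os.environ
ok = set_r_library_path(os.getcwd(), fake_env)
missing = set_r_library_path(os.path.join(os.getcwd(), "no-such-r-library"), {})
```

Returning a boolean lets the process log a clear message when the directory is missing, instead of failing later with an R "package not found" error.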

R process Example

The following is a buffer process using the rgeos library. There are several ways to implement such a process, e.g. relying more on Python-R objects or more on native R code. The example follows the native R code approach.

#request=execute&service=WPS&version=1.0.0&identifier=bufferRGEOS&datainputs=[data=http://rsg.pml.ac.uk/wps/testdata/simplePoly.gml;width=10]

from pywps.Process import WPSProcess

class bufferRGEOS(WPSProcess):
    def __init__(self):
        WPSProcess.__init__(self,
            identifier = "bufferRGEOS",
            title="Buffer creation using RGEOS library",
            abstract="""Buffer creation using RGEOS, GML 2.1.2 I/O""",
            version = "1.0",
            storeSupported = "true",
            statusSupported = "true")

        
        # Adding process inputs
        
        self.data = self.addComplexInput(identifier="data",
                    title="Input GML: http://rsg.pml.ac.uk/wps/testdata/simplePoly.gml",
                    formats = [{'mimeType':'text/xml'}])

        self.width = self.addLiteralInput(identifier="width",
                    title = "Some buffer width",
                    type=type(0.0))

        self.buffer=self.addComplexOutput(identifier="buffer",title="Buffer output as GML 2.1.2")
        
    def execute(self):

        import rpy2.robjects as robjects
        import sys

        R = robjects.r

        # Send R's verbose output to pywps.log
        sys.stdout = self.logFile
        try:
            # Load the GDAL and GEOS bindings
            R["library"]("rgdal")
            R["library"]("rgeos")

            # WPS input to R
            R('poly<-readOGR(dsn="%s",layer="simplePoly")' % self.data.getValue())

            # Buffer the polygon; writeOGR expects a Spatial*DataFrame,
            # so a dummy attribute table is attached to the result
            R('buffer<-gBuffer(poly,width=%s)' % self.width.getValue())
            R('fakeDF<-data.frame(foo=c("NULL"))')
            R('row.names(fakeDF)<-c("buffer")')
            R('bufferOut<-SpatialPolygonsDataFrame(buffer,fakeDF)')
            R('writeOGR(bufferOut,dsn="./out.gml",layer="simplePoly",driver="GML")')
        finally:
            # Restore normal stdout even if R raised an error
            sys.stdout = sys.__stdout__

        # WPS output from R
        self.buffer.setValue("./out.gml")

        return
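The Execute request shown at the top of the example can also be assembled programmatically. This is a minimal sketch assuming Python 3; build_execute_url() is a hypothetical helper, and the endpoint http://localhost/wps.cgi is taken from the sample response earlier on this page, so it has to be adapted to the actual server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_execute_url(endpoint, identifier, inputs):
    """Return a WPS 1.0.0 KVP Execute URL for the given process and inputs."""
    # DataInputs uses the bracketed key=value;key=value form of the request
    data_inputs = "[" + ";".join("%s=%s" % (k, v) for k, v in inputs) + "]"
    query = urlencode({
        "request": "execute",
        "service": "WPS",
        "version": "1.0.0",
        "identifier": identifier,
        "datainputs": data_inputs,
    })
    return endpoint + "?" + query


url = build_execute_url(
    "http://localhost/wps.cgi",
    "bufferRGEOS",
    [("data", "http://rsg.pml.ac.uk/wps/testdata/simplePoly.gml"), ("width", 10)],
)

# Round trip: decoding the query gives back the original DataInputs string
decoded = parse_qs(urlsplit(url).query)
```

urlencode() percent-encodes the brackets, semicolon and embedded URL, so the GML reference cannot be confused with the separators of the outer query string.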

--Jmdj 08:28, 20 April 2011 (BST)