Posted on December 13, 2010 by Alan Williams in Features of Taverna

R scripts

R is a commonly used language and environment for statistical computing. R is free and open source. A large amount of documentation on R is available online, including the user manual and the language definition.

An R script can be included as a service within Taverna.  When the service is called, Taverna uses an R server to evaluate the script. In Taverna you can load, save and edit the scripts for the service.

A large number of functions are available for R, grouped into R packages. Once a package has been installed within your R installation, you can include calls to its functions within the R script.

At the moment, you cannot call an R function directly as a service; the call must be placed within a script.
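A minimal sketch of what wrapping a function call in a script can look like. The names values_input and std_dev are hypothetical stand-ins for an input and an output of the service; the first line assigns sample data only so that the snippet also runs outside Taverna.

```r
# Hypothetical input: inside Taverna this value would be supplied to the script.
values_input <- c(2, 4, 4, 4, 5, 5, 7, 9)

# sd() lives in the 'stats' package, which ships with R; a function from
# any other installed package is called the same way after library(<name>).
library(stats)

# The function call sits inside a script that assigns its result to a
# variable, here the hypothetical output 'std_dev'.
std_dev <- sd(values_input)
```

The script does nothing beyond the one call, which is all that is needed to make an existing function usable as a step in a workflow.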

When to use R in Taverna

You should consider using R in a Taverna workflow if you want to do mathematical or statistical calculations, or if you already have a script or R function that you want to call. Because R is designed for mathematics, it has excellent capabilities for handling arrays and matrices, so it is often simpler to write a script in R than in, say, Beanshell. For example,

sum_of_vector = sum(integer_vector_input);
modulus_integer_vector = unlist(lapply(integer_vector_input,
                                       function(x) 100 %% x));

Here a single R function call, sum(integer_vector_input), calculates the sum of the elements of a vector, and the anonymous function function(x) 100 %% x, defined within the script, is applied to every element of the integer vector.
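To try the snippet outside Taverna, the input can be filled in by hand. In the sketch below the sample values are an assumption made for illustration; within a workflow integer_vector_input would be supplied to the script. The variable modulus_vectorized is an addition that shows %% is itself vectorized, so the same values can be obtained without lapply.

```r
# Sample data (an assumption; in Taverna this value is passed in).
integer_vector_input <- c(3L, 7L, 10L)

# Sum of all elements in one call.
sum_of_vector <- sum(integer_vector_input)

# Apply an anonymous function to each element, then flatten the list.
modulus_integer_vector <- unlist(lapply(integer_vector_input,
                                        function(x) 100 %% x))

# %% is vectorized, so this gives the same values directly.
modulus_vectorized <- 100 %% integer_vector_input

print(sum_of_vector)           # [1] 20
print(modulus_integer_vector)  # [1] 1 2 0
```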

R also has plotting functions that can be used to create graphs in various formats.
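As a rough illustration, the sketch below writes a simple plot to a PDF file using R's base graphics. The data and the temporary file name are made up for the example; other devices such as png() follow the same open, plot, close pattern where the R installation supports them.

```r
# Made-up data for illustration only.
x <- 1:10
y <- x^2

# Open a PDF graphics device that writes to a temporary file.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

# Draw the plot; output goes to the open device rather than the screen.
plot(x, y, type = "b", main = "y = x^2", xlab = "x", ylab = "y")

# Close the device so the file is flushed to disk.
invisible(dev.off())
```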

Because R is so capable, it may seem unnecessary to include R scripts within an overall workflow. However, including the R script within Taverna lets you exchange data with non-R services: you can pass data from WSDL or REST services into R and feed the results of an R calculation into other services. For example, this workflow on myExperiment uses WSDL services to fetch population data about the countries of the world and then uses R to produce a plot of that data.
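A minimal sketch of the R side of such a hand-off, assuming an upstream service has delivered its result as CSV text. The variable population_csv, its column names and the figures in it are invented for illustration and are not taken from the myExperiment workflow.

```r
# Hypothetical input: CSV text as it might arrive from an upstream service.
population_csv <- "country,population
Aland,100
Borduria,250
Syldavia,150"

# Parse the text into a data frame.
population <- read.csv(text = population_csv, stringsAsFactors = FALSE)

# Derive values that a downstream service could consume.
total_population <- sum(population$population)
largest_country <- population$country[which.max(population$population)]
```

Keeping the parsing step inside the R script means the upstream service does not need to know anything about R data structures; it only has to produce text.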

Using R within Taverna allows you to pass data between R servers on different machines. This is useful if the R servers have site-specific data that cannot be easily transferred, or you wish to access R installations with different packages.

The data that is input to and output from R scripts can be kept as part of Taverna’s provenance collection. This helps when you are trying to debug or reproduce statistical calculations, or when you need to produce reports about the experiments you ran.

How to use R within Taverna

To make use of Rshell services in Taverna, you have to:

  1. Add an Rshell service to your workflow
  2. Configure the Rshell service by:

Example workflows using R

Example workflows containing R services are available on myExperiment.
