Posted on December 13, 2010 by Katy Wolstencroft and Robert Stevens in Features of Taverna

Domain Services and Shim Services

Conceptually, there are two different types of services used in Taverna: domain services and shim (or “helper”) services.

  • Domain services perform a scientific function. These services are generally provided by third parties and usually cannot be altered by the workflow designer. Examples include the GenBank, EMBL and BioMart databases and the NCBI_BLAST sequence similarity and alignment tool.
  • Shim services do not perform scientific functions. They are ‘helper’ or data transformation services, used to connect domain services whose data types or formats are incompatible (Hull et al., 2006; Radetzki et al., 2006). For example, a shim would convert a plain text protein sequence into a Fasta-formatted protein sequence if the next service in the workflow only accepted Fasta-formatted sequences. The term ‘shim’ is derived from engineering, where it means a thin piece of metal or wood used to fill the space between ill-fitting components.
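The plain-text-to-Fasta shim mentioned above can be sketched in a few lines. The Python below is only an illustration of the transformation, not the code of any actual Taverna service; the function name `to_fasta`, the default identifier and the 60-character line width are assumptions chosen for the example.

```python
def to_fasta(sequence: str, identifier: str = "query", width: int = 60) -> str:
    """Wrap a plain-text sequence in a Fasta record (hypothetical shim)."""
    # Remove spaces and line breaks that plain-text sequences often carry.
    residues = "".join(sequence.split())
    if not residues:
        raise ValueError("empty sequence")
    # A Fasta record is a header line starting with '>' followed by the
    # sequence, conventionally broken into fixed-width lines.
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"
```

Note that nothing in the function is scientific: it changes only how the sequence is presented, which is what makes it a shim rather than a domain service.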

Unlike domain services, shims can often be created by the scientist designing a workflow. A rule of thumb for distinguishing a domain service from a shim service is that a workflow with its shim services hidden is equivalent to the methods section of a scientific paper. If a service needs to be explicitly mentioned in the methods, then it is not a shim. This distinction is important for describing services, and it affects the way scientists might discover and use them.

Choosing a domain service and choosing a shim service are different tasks. There are many BLAST services and they all do ‘the same thing’, i.e. they perform BLAST analyses. In reality, however, they may not do the same thing. Each one may search different underlying DNA and protein databases. Where services search the same underlying databases, they may not use the same version of those databases. The version of the BLAST algorithm itself might also be different. Deciding which service is appropriate therefore depends on the experience of the scientists involved in the experiment. In contrast, choosing a shim service is much simpler: the implementation is immaterial as long as the input and output are correct.

In a Taverna workflow, shims are common because there is no common type system shared by all the bioinformatics services. Figure 1 shows a Taverna workflow with the shims highlighted; note that shims are needed between nearly every pair of domain services.

Figure 1: A workflow showing shim services and domain services

Finding Services

There are two ways of finding domain services:

  1. Using the BioCatalogue, a registry of services for the Life Sciences.
  2. Using the Taverna services panel.

Shim or helper services can be found in four ways:

  1. Through the BioCatalogue.
  2. Through Taverna’s own local services in the service panel.
  3. By taking a “shim” from another workflow, for example by searching myExperiment for “shim”.
  4. By writing a Beanshell script to “shim” services together.
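A hand-written shim is often only a few lines long. The sketch below is written in Python rather than Beanshell, purely to illustrate the kind of logic involved; the function name `split_identifiers` and the example identifiers are hypothetical. It assumes one service returns a block of text with one identifier per line, while the next service expects a list of identifiers.

```python
from typing import List


def split_identifiers(text: str) -> List[str]:
    """Turn newline-separated identifiers into a list (hypothetical shim)."""
    # Split on line breaks, trim surrounding whitespace, drop blank lines.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Example: output of one service reshaped for the next one.
ids = split_identifiers("P12345\n\n  Q67890 \n")
print(ids)
```

In Taverna the same logic would live inside a Beanshell service, with the text arriving on an input port and the list leaving on an output port.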
