Components for Designing and Executing Workflows

This K-Blog is a practical handbook for those interested in building scientific workflows with Taverna [1]. It is a guide to designing, building and executing workflows, to finding and using existing workflows, and to understanding the processes involved in producing, managing and visualising their results.

Taverna is an open source workflow management system that allows scientists to chain together distributed analysis tools and data sources in order to automate complex informatics processes. Taverna was designed for the multitude of distributed bioinformatics resources that are now available, but nothing in its code restricts it to bioinformatics. In recent years, scientists from other disciplines, such as astronomy, chemistry and medical informatics, have also begun to use the workbench to design their experiments.
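To make the idea of chaining tools and data sources concrete, here is a minimal conceptual sketch in Python. It is not Taverna code and does not use any Taverna API. The three step functions and the `run_workflow` helper are hypothetical stand-ins for the distributed services a real workflow would call, and they are only meant to show how the output of one step becomes the input of the next.

```python
from typing import Any, Callable, Sequence

# Hypothetical stand-ins for remote services. In a real workflow these
# would be calls to distributed data sources and analysis tools.


def fetch_sequence(identifier: str) -> str:
    """Pretend data source: look up a DNA sequence by identifier."""
    database = {"demo1": "ATGGCCATTGTAATGGGCCGC"}  # made-up example data
    return database[identifier]


def transcribe(dna: str) -> str:
    """Pretend analysis tool: transcribe DNA to RNA."""
    return dna.replace("T", "U")


def gc_content(rna: str) -> float:
    """Pretend analysis tool: fraction of bases that are G or C."""
    return (rna.count("G") + rna.count("C")) / len(rna)


def run_workflow(steps: Sequence[Callable[[Any], Any]], initial_input: Any) -> Any:
    """Run the steps in order, feeding each output into the next step."""
    value = initial_input
    for step in steps:
        value = step(value)
    return value


# A linear three-step pipeline: data source, then two analysis tools.
result = run_workflow([fetch_sequence, transcribe, gc_content], "demo1")
print(f"GC content: {result:.3f}")
```

A workflow system adds a great deal on top of this simple chain, such as branching data flows, remote service invocation and the recording of results, but the core idea of connecting outputs to inputs is the same.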

Taverna is an environment for building and executing workflows, but this is only part of the process of managing workflow experiments. Taverna is part of a much larger suite of software, the myGrid toolkit, which supports the whole in silico experiment life cycle. This includes designing experiments, sharing and publishing experiments, and managing the experimental outcomes. Figure 1 shows the different phases in the in silico experiment life cycle.

Figure 1: The in silico Experiment Life Cycle [2]

myGrid components include the BioCatalogue, for service discovery and service monitoring; the myExperiment workflow repository, for publishing and sharing workflows; a provenance management component, for recording workflow outcomes; and the Taverna execution engine, for running workflows either locally or remotely via the Taverna server.

In this book we will explore the use of Taverna and myGrid in the field of bioinformatics, drawing on real-world examples of workflows in the wild. Bioinformatics, and related computational biology fields (e.g. Systems Biology and ‘omics analyses), are Taverna’s largest user communities.

[1] D. Hull, K. Wolstencroft, R. Stevens, C. Goble, M.R. Pocock, P. Li, and T. Oinn, Taverna: a tool for building and running workflows of services. Nucleic Acids Res 34 (2006) W729-32.

[2] R. Stevens, J. Zhao, and C. Goble, Using provenance to manage knowledge of in silico experiments. Brief Bioinform 8 (2007) 183-94.

