on December 13, 2010 by Stian Soiland-Reyes in Features of Taverna

Parallel service invocations

As Taverna workflows are data-driven, you don’t normally need to define explicitly when a service is to be invoked, only which inputs it is to receive. Taverna will execute the service as soon as all the required inputs from upstream services have been received.

This means that if you have services B and C which both take their input from service A, then as soon as A returns a value, both B and C will be started. This kind of implicit parallelisation in Taverna will normally help increase data throughput, as C does not depend on B.
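To make the idea concrete, here is a minimal sketch (in plain Python, not Taverna itself) of the same data-driven pattern: B and C are started together as soon as A’s output is available. The service functions are hypothetical stand-ins.

    # Minimal sketch of data-driven parallelism: B and C both consume A's
    # output, so they can run concurrently once A has finished.
    # service_a/service_b/service_c are hypothetical stand-ins, not Taverna APIs.
    from concurrent.futures import ThreadPoolExecutor

    def service_a():
        return "query sequence"

    def service_b(value):
        return "B(" + value + ")"

    def service_c(value):
        return "C(" + value + ")"

    with ThreadPoolExecutor() as pool:
        a_result = pool.submit(service_a).result()   # wait for A's output
        b_future = pool.submit(service_b, a_result)  # B and C start together,
        c_future = pool.submit(service_c, a_result)  # neither waits for the other
        print(b_future.result(), c_future.result())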

In the workflow Biomart and 2x BLAST, both services blast_ddbj_rodents and blast_ddbj_invertebrates will execute at the same time.

[Screenshot: biomart-blast-rodent-invertebrates – the Biomart and 2x BLAST workflow diagram]

The string constants ddbjinv, ddbjrod and program_value will execute as soon as the workflow starts (as they don’t take any inputs). However, execution of the two BLAST services is delayed until their query ports have been populated from the hsapiens_gene_ensembl service. As soon as the Ensembl service outputs its values, iterations over both BLAST services will start in parallel.

[Screenshot: rodent-invertebrates-running – workflow progress with both BLAST services iterating]

In this example, each BLAST service is processing one input at a time, independently. Depending on the service processing times, this means that one service could be iterating faster than the other, as has happened in the screenshot above, where blast_ddbj_rodents has processed 2 items while blast_ddbj_invertebrates has done 6.

In many cases this behaviour will give you a speed-up of the workflow execution, as you can branch out to several separate services as soon as possible.

When to avoid parallelism

If the parallel services are executed on the same infrastructure (in this case on xml.nig.ac.jp), the backend servers might actually become slower when you are invoking several methods at the same time, as all machines have limited processing power and working memory. Whether this is the case for your services can only be discovered empirically, unless it is mentioned in the service documentation. It could also be that many parallel invocations of the same service speed up your workflow but slow down execution for other users of the service, as you would be taking a larger share of the available resources on the servers.

In some cases the service is not concurrency-safe and can only be invoked separately, for instance because both calls attempt to write to the same local file. In such a case the workflow above will not execute correctly. The simplest way to enforce separate executions here is to add a control link from one service to the other, making blast_ddbj_rodents run only after blast_ddbj_invertebrates has finished. In the long run the service provider should really make the service concurrency-safe, as it could still be broken by concurrent (but separate) workflow runs, possibly by different users.
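A control link is purely an ordering constraint – no data flows along it. A rough sketch of the effect, using hypothetical stand-in functions for the two BLAST services:

    # Sketch of what the control link enforces: all invertebrate BLAST
    # invocations finish before any rodent BLAST invocation starts.
    # The functions and queries are hypothetical stand-ins.
    def blast_invertebrates(query):
        return "invertebrates hit for " + query

    def blast_rodents(query):
        return "rodents hit for " + query

    queries = ["seq1", "seq2", "seq3"]
    invertebrate_hits = [blast_invertebrates(q) for q in queries]  # runs first
    rodent_hits = [blast_rodents(q) for q in queries]              # starts only afterwards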

[Screenshot: biomart-blast-rodent-invertebrates-control-links – the workflow with a control link added]

If a service is executed locally, using command-line invocation, external tools, the API consumer, Beanshell or R scripts, or even nested workflows and some local worker services, then you, as the “service provider”, will have to make sure that such parallel invocations within a workflow are concurrency-safe, or use control links. A good hint that a service is not safe is the use of hard-coded file paths instead of auto-generated temporary files – two concurrent calls could overwrite the same file.
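As an illustration (not taken from any particular Taverna service), a local script that writes to a fixed path is unsafe to run twice at the same time, while one that asks for a fresh temporary file is not:

    # Hypothetical local worker bodies illustrating the file-path problem.
    import tempfile

    def unsafe_worker(data):
        # Two concurrent calls would both write to the same file and
        # overwrite each other's results.
        with open("/tmp/blast_result.txt", "w") as f:
            f.write(data)
        return "/tmp/blast_result.txt"

    def safe_worker(data):
        # Each call gets its own auto-generated temporary file.
        with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
            f.write(data)
            return f.name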

Parallel iterations

In the example workflow above, the two BLAST services were invoked in parallel because they didn’t have any data dependencies between them, but within the iterations only one concurrent call was made to each service. It is also possible in Taverna to increase the number of concurrent jobs on an individual service in a workflow, by right-clicking the service and selecting Configure running → Parallel jobs.

[Screenshot: parallel-jobs – the Configure running / Parallel jobs dialogue]

Increasing this number means that the iteration will keep submitting jobs to the service until the maximum is reached. Note that if there are not enough input values available from upstream services, fewer than the specified number of concurrent jobs might be running.
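Roughly speaking, the setting behaves like a bounded worker pool over the iteration items. A minimal sketch, assuming a stand-in call_blast function and an illustrative limit of 3:

    # Sketch of a "Parallel jobs" limit: at most max_jobs invocations of the
    # service are in flight at any time. call_blast and the queries are
    # hypothetical stand-ins for the real service and its inputs.
    from concurrent.futures import ThreadPoolExecutor

    def call_blast(query):
        return "hit for " + query

    queries = ["seq%d" % i for i in range(10)]
    max_jobs = 3
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        results = list(pool.map(call_blast, queries))   # results keep input order
    print(results)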

It is advisable to check the service documentation or contact the service provider to confirm how many concurrent jobs are acceptable for reasonable use. Setting this number too high might amount to a denial-of-service attack and could get your machine, or even your whole institution, temporarily or permanently blocked by the service provider.

As above, whether increasing this number gives you better or worse performance can only be verified empirically. For many services a setting of 3 gives good performance, but often not more than, say, a 2x speed-up, as the 3 concurrent calls will be competing for server resources. Settings like 100 will most likely cause the service to fail.

Even for services where two concurrent invocations give no direct speed-up (say both calls are disk-bound), a setting of 2 could still help by reducing the effect of overhead: invocation #1 could be uploading or downloading data while invocation #2 is being processed on the server.

In some cases you might find that too many concurrent jobs cause connection issues or errors from the service. This is a good hint that you should reduce the number of jobs. You can also configure retries so that Taverna will retry job submission on errors.
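For intuition, a retry simply repeats the failed invocation after a pause. A small sketch, assuming transient failures surface as exceptions; the retry count and delay are illustrative, not Taverna’s defaults:

    # Sketch of a retry loop for a flaky service call.
    import time

    def call_with_retries(call, value, retries=3, delay=5):
        for attempt in range(retries + 1):
            try:
                return call(value)
            except Exception:
                if attempt == retries:
                    raise                 # give up after the last attempt
                time.sleep(delay)         # wait before trying again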

Some service types in Taverna are deliberately made not to run concurrently, even if the number of parallel jobs has been increased. This is currently the case for both R scripts and Beanshell scripts, as they have shown slow-downs or concurrency issues when run in parallel.

Bottlenecks and out-of-order iterations

If we add a local worker Concatenate_two_strings and connect it to the two BLAST services (using a dot product), you will notice that its execution is held back by the slowest upstream service.

[Screenshot: bottleneck – workflow progress showing Concatenate_two_strings held back by the slower BLAST service]

In some cases you might find that the bottleneck service has processed even fewer items than the slowest upstream service. This could happen when Maximum jobs is set to more than 1 and service invocations take uneven time. For instance, blast_ddbj_rodents might have returned items 1, 4 and 5 (2 and 3 are still running), while blast_ddbj_invertebrates has returned items 1, 2 and 3 – in which case Concatenate_two_strings only has the inputs to execute for item 1 on both input ports.

If blast_ddbj_invertebrates returned item 4 next, inputs for item 4 on both ports would be processed by Concatenate_two_strings. (If using a cross product, inputs 1×4, 4×4, 5×4 would be processed.)
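For reference, here is a small sketch of the two iteration strategies over two input lists (the list values are just illustrative):

    # Dot product pairs items by position; cross product combines every
    # item from one list with every item from the other.
    from itertools import product

    rodent_hits = ["r1", "r2", "r3"]
    invertebrate_hits = ["i1", "i2", "i3"]

    dot = list(zip(rodent_hits, invertebrate_hits))
    # [('r1', 'i1'), ('r2', 'i2'), ('r3', 'i3')]

    cross = list(product(rodent_hits, invertebrate_hits))
    # [('r1', 'i1'), ('r1', 'i2'), ('r1', 'i3'), ..., ('r3', 'i3')]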

Note that Taverna takes care of preserving the order of iteration items – so the output of the second invocation of Concatenate will still end up at index 4 in the returned list – which is why you might sometimes see gaps in the list showing Waiting for data in the Workflow results tab.
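The effect can be illustrated with a small sketch, assuming a stand-in concatenation step and uneven processing times: results are written back to their original index even though invocations complete out of order, and slots that are still empty correspond to Waiting for data.

    # Sketch of order-preserving results under out-of-order completion.
    # concat, the input pairs and the timing are all illustrative.
    from concurrent.futures import ThreadPoolExecutor, as_completed
    import random
    import time

    def concat(pair):
        time.sleep(random.random())       # simulate uneven service times
        return pair[0] + pair[1]

    pairs = [("a", "1"), ("b", "2"), ("c", "3"), ("d", "4")]
    results = [None] * len(pairs)         # None = "Waiting for data"
    with ThreadPoolExecutor(max_workers=2) as pool:
        index_of = {pool.submit(concat, p): i for i, p in enumerate(pairs)}
        for future in as_completed(index_of):            # arbitrary completion order
            results[index_of[future]] = future.result()  # but the index is kept
    print(results)                        # outputs appear in original list order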

This out-of-order invocation is due to services starting as soon as their inputs are available, and should not normally be cause for concern. However, if your service expects to be invoked in list order, you will need to delay execution of the vulnerable service using a control link. The downside is that the service will not start until all upstream iterations have completed.

Similarly, if Concatenate_two_strings were replaced with a worker expecting an input list, like Merge_string_list_to_a_String, it would not start until the full input list is available. However, if the iterations above output a list of lists (depth 2), invocation of the service will start as soon as any of the inner lists is complete.
