Step 24 - workflow instead of process
By now we have collected quite a bit of helper functionality that a DiFlow module should provide:
- Generating an output file name based on input
- Parsing different types of input file references
- Selecting the proper key from the (large) ConfigMap stored in the global params Map
- Generating the CLI from nextflow.config
- Providing a test case for the module
There is some hidden functionality as well:
- Making sure input and output file references are updated in the ConfigMap
- Dealing with per-sample configuration (upcoming feature)
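Two of the helpers listed above can be illustrated with a minimal plain-Groovy sketch. The function names (outputFileName, selectConfig), the naming scheme and the layout of the example params map are assumptions made for illustration only; they are not DiFlow's actual helpers.

```groovy
// Hypothetical helpers for illustration; not DiFlow's actual API.

// Derive an output file name from an input file name and a module key,
// e.g. "data/sample1.fastq" + "cellranger" -> "sample1.cellranger.out".
def outputFileName(String input, String key, String extension = "out") {
    def base = input.tokenize('/').last()                 // drop leading directories
    def dot = base.lastIndexOf('.')
    def stem = dot > 0 ? base.substring(0, dot) : base    // drop the last extension
    return "${stem}.${key}.${extension}".toString()
}

// Select the sub-map for one module from a large, nested configuration map.
def selectConfig(Map params, String key) {
    if (!params.containsKey(key)) {
        throw new IllegalArgumentException("No configuration found for key '${key}'")
    }
    return params[key]
}

// Assumed layout: one entry per module, next to global settings.
def params = [
    dockerPrefix: "",
    cellranger  : [name: "cellranger", container: "cellranger:latest"],
    other       : [name: "other"]
]

def config = selectConfig(params, "cellranger")
println outputFileName("data/sample1.fastq", config.name)
```

Keeping these as small pure functions means they can be tested without starting a pipeline.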
As it turns out, providing all this functionality in a process is not the proper way to go, and is not even expected to work. Luckily, we can define workflows in Nextflow's DSL2 syntax. Such a workflow can be used just like a process in the above example, as long as we take care that the input/output signatures are aligned with what is expected.
The added benefit of using a workflow rather than a process is that the underlying process can have a different signature. In practice, this is what a DiFlow process looks like:
process cellranger_process {
  ...
  container "${params.dockerPrefix}${container}"
  publishDir "${params.output}/processed_data/${id}/", mode: 'copy', overwrite: true
  input:
    tuple val(id), path(input), val(output), val(container), val(cli)
  output:
    tuple val("${id}"), path("${output}")
  script:
    """
    export PATH="${moduleDir}:\$PATH"
    $cli
    """
}
As can be seen, the ConfigMap is not passed to the process. Instead, the information in params is used to generate an output filename, extract the container image and generate the CLI instruction, and these are passed to the process as plain values. Please note that input here points to ALL possible input file references as per Step 23.
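To make the change of signature concrete, the following plain-Groovy sketch turns an (id, input, params) triplet into the five-element tuple that cellranger_process expects. The closure toProcessTuple, the helper renderCli, the output naming scheme and the layout of the example params map are all hypothetical; the way DiFlow actually derives these values may differ.

```groovy
// Hypothetical sketch: (id, input, params) -> (id, input, output, container, cli).

// Render a command line from a command name and a map of named arguments.
def renderCli(String command, Map arguments) {
    def args = arguments.collect { name, value -> "--${name} '${value}'" }
    return ([command] + args).join(" ")
}

def toProcessTuple = { id, input, params ->
    def config = params.cellranger                    // module-specific part of params
    def output = "${id}.cellranger.out".toString()    // assumed naming scheme
    def cli = renderCli(config.command, config.arguments + [input: input, output: output])
    [id, input, output, config.container, cli]
}

// Assumed layout of the module configuration.
def params = [
    cellranger: [
        command  : "cellranger count",
        container: "cellranger:latest",
        arguments: [id: "run1"]
    ]
]

def tuple = toProcessTuple("sample1", "sample1.fastq", params)
println tuple
```

In a pipeline, a closure of this shape could be applied to a channel with Nextflow's map operator before the result is handed to the process.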
The workflow that points to this process is then defined as follows:
workflow cellranger {
  take:
    id_input_params_
  main:
    ...
    result_ = ...
  emit:
    result_
}