Step 24 - workflow instead of process

Step 24 - `workflow` instead of `process`

We already have quite some helper functionality that should be provided by a DiFlow module:

Generating an output file name based on input
Parsing different types of input file references
Selecting the proper key from the (large) ConfigMap stored in the global params Map.
Generating the CLI from nextflow.config
Providing a test case for the module

There is some hidden functionality as well:

Making sure input and output file references are updated in the ConfigMap
Dealing with per-sample configuration (upcoming feature)

As it turns out, providing all this functionality in a process is not the proper way to go, and is even expected not to work. Luckily we can define workflows in NextFlow’s DSL2 syntax. Such a workflow can be used just like a process in the above example as long as we take care that the input/output signatures are aligned with what is expected.

The added benefit of using a workflow rather than a process is that the underlying process can have a different signature. In practice, this is what a DiFlow process looks like:

process cellranger_process {
  ...
  container "${params.dockerPrefix}${container}"
  publishDir "${params.output}/processed_data/${id}/", mode: 'copy', overwrite: true
  input:
    tuple val(id), path(input), val(output), val(container), val(cli)
  output:
    tuple val("${id}"), path("${output}")
  script:
    """
    export PATH="${moduleDir}:\$PATH"
    $cli
    """
}

As can be seen, the ConfigMap is not passed to the process but instead the information in params is used to generate an output filename, extract the container image and generate the CLI instruction. Please note that input here points to ALL possible input file references as per Step 23.

The workflow that points to this process is then defined as follows:

workflow cellranger {
  take:
    id_input_params_
  main:
    ...
    result_ =  ...
  emit:
    result_
}