Link Search Menu Expand Document

Step 14 - Publishing output

Let us tackle one of the pain points of the previous example: output files are hidden in the work directory. One might be tempted to specify an output file in the process definition as such file("<somepath>/output.txt") but when you try this, it will quickly turn out that this does not work in the long run (though it may work for some limited cases).

NextFlow provides a better way to achieve the required functionality: publishDir. Let us illustrate its use with an example again and just adding the publishDir directive:

// Step - 14
process process_step14 {
    publishDir "output/"
    input:
        tuple val(id), file(input), val(config)
    output:
        tuple val("${id}"), file("output.txt"), val("${config}")
    script:
        """
        a=`cat $input`
        let result="\$a + ${config.term}"
        echo "\$result" > output.txt
        """
}
workflow step14 {
    Channel.fromPath( params.input ) \
        | map{ el ->
          [
            el.baseName.toString(),
            el,
            [ "operator" : "-", "term" : 10 ]
          ]} \
        | process_step14 \
        | view{ [ it[0], it[1] ] }
}

This single addition yields:

> nextflow -q run . -entry step14 --input data/input1.txt
WARN: DSL 2 IS AN EXPERIMENTAL FEATURE UNDER DEVELOPMENT -- SYNTAX MAY CHANGE IN FUTURE RELEASE
[input1, <...>/work/1a/9fe9ead2ecfcf5db71d5c6a3acfeda/output.txt]

This example shows us a powerful approach to publishing data. There is a similar drawback as for the output filenames, however, and that is that the process defines the output directory explicitly. But there is a different problem as well, which can be observed when running on multiple input files:

> nextflow -q run . -entry step14 --input "data/input?.txt"
WARN: DSL 2 IS AN EXPERIMENTAL FEATURE UNDER DEVELOPMENT -- SYNTAX MAY CHANGE IN FUTURE RELEASE
[input3, <...>/work/87/81d9a9c9a7ab73110834aa6d1cffcc/output.txt]
[input2, <...>/work/5f/85dfd25d7b0156782200fede6d9bc3/output.txt]
[input1, <...>/work/8e/028d5c3c1f072f5edc6d306f998ad2/output.txt]
> cat output/output.txt
11

What do you think happens here? Yes, sure, we publish the same output.txt file three times and each time overwriting the same file. The last one is the one that persists.