Step 17 - Add the output file to params
In this case, we would add a output = ...
key to nextflow.config
or provide --output ...
on the CLI. This is done in the following example:
// Step - 17
process process_step17 {
publishDir "output"
input:
tuple val(id), file(input), val(config)
output:
tuple val("${id}"), file(params.output), val("${config}")
script:
"""
a=`cat $input`
let result="\$a + ${config.term}"
echo "\$result" > ${params.output}
"""
}
workflow step17 {
Channel.fromPath( params.input ) \
| map{ el ->
[
el.baseName.toString(),
el,
[
"id": el.baseName,
"operator" : "-",
"term" : 10
]
] } \
| process_step17 \
| view{ [ it[0], it[1] ] }
}
The code that is run:
> nextflow -q run . -entry step17 --input data/input.txt --output output.txt
WARN: DSL 2 IS AN EXPERIMENTAL FEATURE UNDER DEVELOPMENT -- SYNTAX MAY CHANGE IN FUTURE RELEASE
[input, <...>/work/b1/7f1b50fa5f6e9fdb83bc656e8946e7/output.txt]
The result is:
> cat output/output.txt
11
We note that the params.output
occurs in the output:
triplet as well as in the script code itself. That’s quite important, otherwise NextFlow will complain the output file can not be found.
This approach does what it is supposed to do: make the output filename configurable. There are a few drawbacks however:
- We would have to configure the filename for every process individually. While this can be done (
params.<process>.output
for instance), it requires additional bookkeeping on the side of the pipeline developer. - It does not help much because the output filename for every parallel branch again has the same name. In other words, we still have to have the
publishDir
In all fairness, these issues only arise when you want to publish the output data because in the other case every process lives in its own (unique) work
directory.