Batteries Included: Supercharging Bioinformatics Modules with Viash

TL;DR

Viash comes with powerful built-in features that would normally require significant additional coding: parallel batch processing for speed, container management for reproducibility, and integrated testing for reliability. These “batteries included” features save you from writing hundreds of lines of boilerplate code.


Integrated Testing

The Testing Challenge in Bioinformatics

Traditional testing involves:

  • Writing test scripts
  • Managing test data
  • Setting up environments
  • Comparing expected vs. actual outputs

These steps are often skipped, which leaves tools fragile. Viash makes testing a first-class citizen.

Built-in Testing with Viash

Example test script test.sh:

#!/bin/bash

echo ">>> Testing $meta_functionality_name"
"$meta_executable" \
  --input "$meta_resources_dir/test.paired_end.sorted.bam" \
  --output "$meta_resources_dir/test.paired_end.sorted.txt"

echo ">>> Checking whether output is non-empty"
[ ! -s "$meta_resources_dir/test.paired_end.sorted.txt" ] && echo "File 'test.paired_end.sorted.txt' is empty!" && exit 1

echo ">>> Checking whether output is correct"
# Ignore the header line that records the exact command, since it differs between runs.
# Braces (not parentheses) are used so that "exit 1" ends the script rather than a subshell.
diff <(grep -v "^# The command" "$meta_resources_dir/test.paired_end.sorted.txt") \
     <(grep -v "^# The command" "$meta_resources_dir/ref.paired_end.sorted.txt") || \
  { echo "Output file test.paired_end.sorted.txt does not match expected output"; exit 1; }

rm "$meta_resources_dir/test.paired_end.sorted.txt"

echo ">>> All tests passed successfully."

exit 0

Viash config addition:

name: samtools_stats
arguments:
  ...
test_resources:
  - type: bash_script
    path: test.sh
  - type: file
    path: test.paired_end.sorted.bam
  - type: file
    path: ref.paired_end.sorted.txt

Run tests with:

viash ns test -q samtools_stats
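
The two checks in test.sh (output is non-empty, output matches a reference) recur in most component tests, so they can be factored into small helpers. The following is a minimal sketch; the function names `assert_file_not_empty` and `assert_same_ignoring` are hypothetical and not part of Viash.

```shell
#!/bin/bash
# Hypothetical helpers for a component test script.
# The function names are illustrative and are not provided by Viash.

# Fail with a message unless the file ($1) exists and is non-empty.
assert_file_not_empty() {
  if [ ! -s "$1" ]; then
    echo "File '$1' is missing or empty!" >&2
    return 1
  fi
}

# Compare two files while ignoring lines that match a pattern.
#   $1: grep pattern for lines to ignore (e.g. a header recording the command)
#   $2: file produced by the test
#   $3: reference file
assert_same_ignoring() {
  if [ "$(grep -v "$1" "$2")" != "$(grep -v "$1" "$3")" ]; then
    echo "File '$2' does not match '$3'" >&2
    return 1
  fi
}
```

In a test script, each helper would be followed by `|| exit 1`, for example `assert_file_not_empty "$meta_resources_dir/test.paired_end.sorted.txt" || exit 1`, so that a failed check stops the test with a non-zero exit code.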

Why Viash Testing Is a Game-Changer

  • Reproducibility: Tests run in the same environment as production
  • CI/CD Friendly: Easily integrates into pipelines
  • Version Control: Test code lives beside the component

Parallel Processing

The Challenge

Bioinformaticians often need to process many samples while also handling:

  • Resource tracking
  • Logging
  • Monitoring

Viash provides batch processing out of the box when a component is run as a Nextflow module.

The Viash Way: Parameter Lists

Example param_list.yaml:

- id: sample_1
  input: test.paired_end.sorted_1.bam
  output: test.paired_end.sorted_1.txt
- id: sample_2
  input: test.paired_end.sorted_2.bam
  output: test.paired_end.sorted_2.txt
- id: sample_3
  input: test.paired_end.sorted_3.bam
  output: test.paired_end.sorted_3.txt

Run it with:

nextflow run target/nextflow/samtools_stats/main.nf \
  --param_list param_list.yaml \
  -profile docker \
  --publish_dir test
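
For more than a handful of samples, the parameter list does not have to be written by hand. The sketch below generates one entry per BAM file in a directory. `make_param_list` is a hypothetical helper, not a Viash command, and it assumes that the sample id can be taken from the file name, that the output should be named after the id with a `.txt` extension, and that file names contain no characters needing YAML quoting.

```shell
#!/bin/bash
# Sketch: print a parameter list (YAML) with one entry per BAM file.
# make_param_list is a hypothetical helper, not part of Viash.
#   $1: directory containing the .bam files
make_param_list() {
  for bam in "$1"/*.bam; do
    # When nothing matches, the glob stays literal; skip it.
    [ -e "$bam" ] || continue
    # Derive the sample id from the file name (assumption).
    id="$(basename "$bam" .bam)"
    printf '%s\n' "- id: $id"
    printf '%s\n' "  input: $bam"
    printf '%s\n' "  output: $id.txt"
  done
}
```

A call such as `make_param_list data/bam > param_list.yaml` (the directory name is a placeholder) would then produce a file in the same shape as the hand-written example.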

Why Batch Processing with Viash Rocks

  • Efficiency: No need to code parallelization logic
  • Flexibility: Sample-specific parameters supported
  • Simplicity: A single YAML file defines every run

Container Management

The Reproducibility Problem

Containers solve environment drift, but are often:

  • Hard to configure
  • Hard to version
  • Hard to debug

Viash to the Rescue

Viash handles:

  • Dockerfile generation
  • Container building and caching
  • Volume mounting
  • Lifecycle management

Custom Docker snippet in config:

engines:
  - type: docker
    image: quay.io/biocontainers/samtools:1.19.2--h50ea8bc_1
    setup:
      - type: docker
        run: |
          samtools --version 2>&1 | grep -E '^(samtools|Using htslib)' | \
          sed 's#Using ##;s# \([0-9\.]*\)$#: \1#' > /var/software_versions.txt
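
To see what the setup step records, the same grep/sed pipeline can be tried outside the container. The sketch below wraps it in a function that reads from standard input; `format_versions` is a hypothetical name, and the sample input is illustrative rather than real `samtools --version` output.

```shell
#!/bin/bash
# Same grep/sed pipeline as in the setup step, wrapped in a function that
# reads from stdin so it can be tried without samtools installed.
# format_versions is a hypothetical name used only for this sketch.
format_versions() {
  grep -E '^(samtools|Using htslib)' |
    sed 's#Using ##;s# \([0-9\.]*\)$#: \1#'
}

# Illustrative input only: the version numbers are placeholders and real
# output contains more lines.
format_versions <<'EOF'
samtools 1.19.2
Using htslib 1.19.1
(further lines omitted)
EOF
# prints:
# samtools: 1.19.2
# htslib: 1.19.1
```

The grep keeps only the two version lines, and the sed expression drops the "Using " prefix and turns the trailing version number into a `name: version` pair.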

Inspect Dockerfile:

viash run src/config.vsh.yaml ---dockerfile

Debug mode:

viash run src/config.vsh.yaml ---debug

Why Viash Container Management Is a Game-Changer

  • No Docker Knowledge Needed
  • Consistent Environments Everywhere
  • Transparent Versioning and Caching
  • Multi-container Tech Support

What’s Next?

In the next post, we’ll show how to combine Viash components into modular workflows, such as RNA-seq pipelines.

Check out the Viash documentation for more.
