Building Powerful Workflows with Viash Components

TL;DR

Viash lets you build bioinformatics workflows from reusable, independently tested components that plug into your existing workflows and environment. The resulting workflows are easier to maintain and extend than pipelines held together by hand-written glue code.


The Hidden Cost of Traditional Bioinformatics Workflows

Challenges:

  • Complex & redundant glue code
  • Limited modularity
  • Monolithic growth
  • Painful maintenance
  • Error-prone execution
  • Limited resource specification
  • Implicit data flow
  • Missing documentation
  • Complex debugging

Viash addresses these with a component-based approach and automated code generation.


Viash: A Component-Based Approach

Clone the repo to follow along: https://github.com/viash-hub/playground

Viash workflows consist of:

  • A config file: defines components, parameters, engines
  • A Nextflow script: orchestrates execution

Example: Mapping and QC Workflow

Viash Config (src/mapping_and_qc/config.vsh.yaml)

name: mapping_and_qc
description: Run STAR and QC
arguments:
  ...
dependencies:
  - name: cutadapt
    repository: bb
  - name: falco
    repository: bb
  - name: multiqc
    repository: bb
  - name: star/star_align_reads
    repository: bb
  - name: samtools/samtools_stats
    repository: bb

repositories:
  - name: bb
    type: vsh
    repo: vsh/biobox
    tag: v0.3

runners:
  - type: nextflow

engines:
  - type: native
  - type: docker

Build and Run

# Build the workflow
viash ns build -q mapping_and_qc

# Generate test data and a parameter file
./test_data.sh

# Run the workflow
nextflow run . \
  -profile docker \
  -main-script target/nextflow/mapping_and_qc/main.nf \
  -params-file params_file.yaml \
  --publish_dir workflow_test

# Get help documentation
nextflow run target/nextflow/mapping_and_qc/main.nf --help

Viash converts ~90 lines of VDSL3 into ~3500 lines of robust Nextflow code using deterministic code generation.


Key Advantages of Viash Workflows

Modular, Reusable Components

  • Version-controlled
  • Independently tested and maintained
  • Avoids duplication

Explicit Data Flow

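  • fromState/toState pattern: fromState declares which entries of the workflow state a component receives as arguments, and toState declares how its outputs are written back, so every step's inputs and outputs are traceable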
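
The idea behind fromState/toState can be sketched in plain Python. This is a conceptual illustration only, not Viash's API or generated code: the helper run_component, the stand-in functions trim and align, and all file names are hypothetical. It assumes the simplest form of the pattern, where both mappings are plain key-to-key dictionaries.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def run_component(
    component: Callable[..., Dict[str, Any]],
    state: State,
    from_state: Dict[str, str],
    to_state: Dict[str, str],
) -> State:
    """Run one step with explicit data flow.

    from_state maps component argument names to state keys.
    to_state maps state keys to component output names.
    """
    # Pass only the state entries this component declares as inputs.
    args = {arg: state[key] for arg, key in from_state.items()}
    outputs = component(**args)
    # Copy only the declared outputs back; the input state is not mutated.
    updates = {key: outputs[out] for key, out in to_state.items()}
    return {**state, **updates}


# Hypothetical stand-ins for real tools; they only manipulate file names.
def trim(fastq: str) -> Dict[str, Any]:
    return {"output": fastq.replace(".fastq", ".trimmed.fastq")}


def align(reads: str, reference: str) -> Dict[str, Any]:
    return {"bam": reads.replace(".fastq", ".bam"), "reference_used": reference}


state: State = {"id": "sample1", "fastq": "sample1.fastq", "genome": "ref_index"}
state = run_component(trim, state, {"fastq": "fastq"}, {"trimmed": "output"})
state = run_component(
    align, state, {"reads": "trimmed", "reference": "genome"}, {"bam": "bam"}
)
print(state["bam"])  # sample1.trimmed.bam
```

Because each step names the keys it reads and writes, an output that is not listed in to_state (here reference_used) never leaks into the shared state, which is what makes the data flow easy to audit.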

Flexible Resource Management

  • Assign CPU/memory labels per component (in Nextflow terms, label directives rather than tags)

Input/Output Validation

  • Catches errors early
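
To make "catching errors early" concrete, here is a minimal Python sketch of checking a parameter set against declared arguments before any tool runs. It is not how Viash implements validation; the ARGUMENTS declarations, the validate function, and the parameter names are assumptions made up for this illustration.

```python
from typing import Any, Dict, List

# Hypothetical argument declarations, loosely modelled on the "arguments"
# section of a component config; names and fields are illustrative.
ARGUMENTS: List[Dict[str, Any]] = [
    {"name": "input", "type": str, "required": True},
    {"name": "min_length", "type": int, "required": False},
]


def validate(params: Dict[str, Any], arguments: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems; an empty list means the params are valid."""
    errors: List[str] = []
    declared = {arg["name"] for arg in arguments}
    for arg in arguments:
        name = arg["name"]
        if name not in params:
            if arg["required"]:
                errors.append(f"missing required argument: {name}")
            continue
        if not isinstance(params[name], arg["type"]):
            expected = arg["type"].__name__
            actual = type(params[name]).__name__
            errors.append(f"{name}: expected {expected}, got {actual}")
    # Flag parameters that no declaration mentions (often a typo).
    for name in params:
        if name not in declared:
            errors.append(f"unknown argument: {name}")
    return errors


problems = validate({"min_length": "20", "threads": 4}, ARGUMENTS)
for problem in problems:
    print(problem)
# missing required argument: input
# min_length: expected int, got str
# unknown argument: threads
```

All three problems are reported up front, before a long-running alignment step would otherwise fail halfway through.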

Independent Testing

  • Test components individually before full workflow run

Built-in Documentation

  • Self-documenting via --help

Simplified Maintenance

  • Update individual components without breaking the whole pipeline

Viash currently supports Nextflow, with plans to extend to Snakemake and others.


What’s Next?

In the final post, we’ll explore how to deploy Viash workflows on the cloud for large-scale data processing.
