Building Powerful Workflows with Viash Components

Encapsulate all knowledge, tools, scripts, and operations in a single modular building block that is easy to use, resolves its own dependencies, and integrates readily into pipelines.
Keywords

bioinformatics workflows, modular workflow design, Viash components, reproducible workflows, workflow automation, pipeline development, workflow best practices, Nextflow integration, component-based pipelines, scalable bioinformatics workflows, workflow versioning, data pipeline testing, workflow maintainability, dockerized workflows, Viash workflow builder

Part 3: Building Powerful Workflows with Viash Components

TL;DR: Viash transforms how you build bioinformatics workflows by providing reusable, tested components that integrate seamlessly into your existing workflows and environment. This approach produces workflows that are more sustainable than traditional methods.

In our previous posts, we covered creating basic Viash components and making use of advanced, “batteries-included” features. Now let’s explore how to combine components into powerful workflows for complex bioinformatics analyses.

The Hidden Cost of Traditional Bioinformatics Workflows

Traditional bioinformatics workflows present numerous challenges that hamper productivity and reliability.

These issues collectively slow down bioinformatics research significantly. Viash workflows address these challenges by fundamentally changing how bioinformatics pipelines are constructed.

Viash: A Component-Based Approach to Workflow Development

Viash represents a paradigm shift in how bioinformatics workflows are constructed. By introducing a component-based approach with automated code generation, Viash addresses the fundamental challenges that plague traditional workflow development.

To follow along with the tutorial in this blog post, clone the repository at https://github.com/viash-hub/playground.

The Viash Workflow Architecture

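Viash workflows combine individual components, taken either from our open-source catalogue or from custom in-house components. Creating a Viash Nextflow workflow involves two key files: a Viash config that declares the workflow and its dependencies, and a VDSL3 workflow script that connects the components.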

Let’s see how this works in a real-world example:

Creating a Mapping and QC Workflow

In the Viash config, the required components are declared as dependencies for our workflow. In this example, all the components are imported directly from the Viash catalogue.

src/mapping_and_qc/config.vsh.yaml (YAML)

name: mapping_and_qc
description: Run STAR and QC
arguments:

dependencies:
  - name: cutadapt
    repository: bb
  - name: falco
    repository: bb
  - name: multiqc
    repository: bb
  - name: star/star_align_reads
    repository: bb
  - name: samtools/samtools_stats
    repository: bb

repositories:
  - name: bb
    type: vsh
    repo: vsh/biobox
    tag: v0.3

runners:
  - type: nextflow

engines:
  - type: native
  - type: docker

Building and Running the Workflow

When we build the workflow, Viash will auto-generate a Nextflow workflow containing all the required glue code for parameter validation, data handling and container management.

# Build the workflow
viash ns build -q mapping_and_qc

Note how Viash transformed our simple VDSL3 workflow of ~90 lines into a full Nextflow workflow of ~3500 lines! This transformation is deterministic and rule-based, with no AI or LLM involved, so the generated workflow code is consistent, predictable, and follows established patterns and best practices.

To run the workflow, let’s first generate some test data and a parameter file by executing:

# Generate test data and a parameter file
./test_data.sh

Now we can run our Viash Nextflow workflow, making use of the parameter file.

# Run the workflow
nextflow run . \
  -profile docker \
  -main-script target/nextflow/mapping_and_qc/main.nf \
  -params-file params_file.yaml \
  --publish_dir workflow_test

The Viash-based Nextflow workflow comes with built-in documentation as well:

nextflow run target/nextflow/mapping_and_qc/main.nf --help

Key Advantages of Viash Workflows

Compared to traditional workflow methods, Viash offers several key advantages that change how bioinformatics workflows are built and maintained:

Modular, Reusable Components

Viash components are independent, self-contained modules that can be reused across multiple workflows. Each component is version-controlled, tested, and maintained separately, eliminating code duplication and reducing development time.

Explicit Data Flow

The fromState/toState pattern creates clear, traceable data connections between components. This explicit data handling reduces hidden errors and makes workflows easier to understand and debug.
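To make the pattern concrete, the snippet below writes a minimal VDSL3 sketch to a scratch file and lists the lines where state is read and written. It is an illustration under stated assumptions, not the workflow from the playground repository: the file name `sketch_main.nf`, the state keys `trimmed` and `qc_report`, and the component argument names (`input`, `output`, `outdir`) are hypothetical and should be checked against each component's `--help`.

```shell
# Write a hypothetical VDSL3 workflow sketch to a scratch file.
# Assumption: cutadapt and falco expose arguments named input/output/outdir.
cat > sketch_main.nf <<'EOF'
workflow run_wf {
  take:
  input_ch

  main:
  output_ch = input_ch
    | cutadapt.run(
      // fromState: component argument <- key in the workflow state
      fromState: [ input: "input" ],
      // toState: key in the workflow state <- component output
      toState: [ trimmed: "output" ]
    )
    | falco.run(
      // falco reads the key that cutadapt wrote into the state
      fromState: [ input: "trimmed" ],
      toState: [ qc_report: "outdir" ]
    )

  emit:
  output_ch
}
EOF

# Show which state keys each step reads and writes.
grep -n -E 'fromState|toState' sketch_main.nf
```

Reading the sketch top to bottom shows the data flow: each step names the state keys it consumes and the keys it adds, so a mismatch between steps is visible in the workflow script itself.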

Flexible Resource Management

Resource labels can be easily assigned to each component, allowing fine-grained control over computational requirements without complex configuration.
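On the Nextflow side, this could look like the sketch below, which writes a small configuration file that maps labels to resources. The label names `lowcpu` and `highmem` and the file name `labels.config` are assumptions for illustration; use whichever labels your components actually declare.

```shell
# Write a Nextflow config that maps (assumed) component labels to resources.
cat > labels.config <<'EOF'
process {
  // Label names are assumptions; match them to the labels on your components.
  withLabel: lowcpu  { cpus = 2 }
  withLabel: highmem { memory = 32.GB }
}
EOF

# Print the generated config for inspection.
cat labels.config
```

Such a file can be passed to `nextflow run` with the `-c labels.config` option, which keeps resource settings out of the workflow code.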

Automatic Input/Output Validation

Components automatically validate their inputs and outputs, catching errors early before they cascade through the analysis pipeline, significantly improving reliability.

Independent Testing

Components can be executed and tested in isolation, simplifying debugging and ensuring reliable operation when combined into larger workflows.
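As a sketch, a locally developed component can be tested on its own with `viash test`, which runs the tests declared in that component's config. The path below is hypothetical, and the guard simply skips the run when Viash or the config file is not available.

```shell
# Hypothetical path to a component in your own repository.
component_config="src/my_component/config.vsh.yaml"

if ! command -v viash >/dev/null 2>&1; then
  # Viash is not on the PATH, so there is nothing to run.
  test_status="skipped (viash not installed)"
elif [ ! -f "$component_config" ]; then
  # The placeholder path does not exist in this repository.
  test_status="skipped ($component_config not found)"
elif viash test "$component_config"; then
  test_status="passed"
else
  test_status="failed"
fi

echo "Component test: $test_status"
```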

Built-in Documentation

Viash components and workflows come with built-in help and documentation, making it easier for team members to understand and use each component correctly.

Simplified Maintenance

Container versions are defined at the component level, meaning updates can be made to individual components without affecting the overall workflow structure. This dramatically simplifies version management and tool updates.

Currently, Viash supports the generation of Nextflow workflows, allowing you to leverage all the advantages we’ve discussed. Looking ahead, we’re planning to extend support to other popular workflow platforms like Snakemake, further expanding Viash’s flexibility and integration capabilities across the bioinformatics ecosystem.


What’s Next?

In the final post of this series, we’ll explore how to take your Viash workflows to cloud platforms for scalable execution on large datasets.
