Mastering Bioinformatics Data Workflows
A recap of our live webinar, with the full recording available on YouTube
On May 15, we hosted a free webinar that brought together deep technical insight and real-world experience on building robust, sustainable and efficient bioinformatics workflows. The aim was clear: help teams scale their analyses, keep results reproducible and stay ahead of the accelerating pace of scientific discovery.
▶️ Watch the full recording on YouTube:
https://www.youtube.com/watch?v=UjnN0k1qTss
1. Dries De Mayer – Johnson & Johnson
Single Cell Analysis Simplified: Insights from Building a Better Bioinformatics Workflow
Dries De Mayer shared how the single-cell analysis group at Johnson & Johnson matured its workflow strategy. The team began with highly flexible notebooks that supported rapid experimentation, but as data volumes grew and multimodal studies became standard, notebooks alone could not deliver scalability, consistency or long-term reproducibility.
To address this, the group evolved step by step:
- from notebooks
- to scripts
- to containerised components
- to automated Nextflow pipelines
Each step added more structure and stability. Containers kept tools reproducible over time, while Nextflow enabled parallel execution, consistent environments and large-scale processing across thousands of samples.
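To make this concrete, here is a minimal sketch of what one such containerised, parallelisable step can look like as a Nextflow DSL2 process. The container image, script and file names are illustrative placeholders, not details from the J&J pipeline:

```groovy
// Minimal sketch: one containerised step that Nextflow runs once per sample, in parallel.
nextflow.enable.dsl = 2

process QC_PER_SAMPLE {
    // Pinning a specific image version keeps the tool reproducible over time
    container 'quay.io/example/scanpy:1.9.3'   // hypothetical image

    input:
    tuple val(sample_id), path(counts_h5)

    output:
    tuple val(sample_id), path("${sample_id}_qc.h5ad")

    script:
    """
    run_qc.py --input ${counts_h5} --output ${sample_id}_qc.h5ad
    """
}

workflow {
    // One row per sample in a CSV sample sheet; Nextflow schedules the samples in parallel
    samples_ch = Channel.fromPath(params.sample_sheet)
        .splitCsv(header: true)
        .map { row -> tuple(row.sample_id, file(row.counts)) }

    QC_PER_SAMPLE(samples_ch)
}
```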
Dries highlighted that success didn’t come from selecting a single “best method” but from combining solid science with well-designed, repeatable workflows. Today, the J&J team delivers results within a day, maintains complete traceability and can reliably rerun analyses even years later. The outcome: a workflow ecosystem that is fast, predictable and built for regulated environments.
2. Toni Verbeiren – Data Intuitive
Building Sustainable Bioinformatics Pipelines and Infrastructure
Toni Verbeiren followed with a deep dive into the tools and principles behind sustainable workflow engineering. Early collaborations with J&J revealed a key insight: workflow engines alone are not enough; teams also need a way to package, test, version and share components if pipelines are to remain maintainable.
This is exactly what Viash and the OpenPipeline ecosystem are designed for.
Viash turns scripts and prototype code into production-ready building blocks that include:
- clear inputs and outputs
- a containerised runtime
- metadata and versioning
- automated unit and integration tests
These components integrate seamlessly into Nextflow workflows, forming modular pipelines that run identically on a laptop, an HPC cluster, Kubernetes or a cloud environment.
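As a rough illustration of that modularity, a pipeline can be assembled by including such components as Nextflow modules and chaining them in a workflow block. The module paths and component names below are hypothetical; in a Viash-based project they would point at the modules Viash generates:

```groovy
// Minimal sketch: composing two self-contained components into one pipeline.
nextflow.enable.dsl = 2

// Hypothetical module paths; each module ships with its own container definition
include { normalize } from './modules/normalize/main.nf'
include { cluster }   from './modules/cluster/main.nf'

workflow {
    input_ch = Channel.fromPath(params.input)

    // Because every component carries its own containerised environment,
    // the same workflow runs unchanged on a laptop, an HPC cluster or in the cloud.
    normalized_ch = normalize(input_ch)
    cluster(normalized_ch)
}
```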
Furthermore, Viash Hub provides a structured catalogue where components can be published, discovered and reused, with support for both public and private instances.
This approach eliminates common workflow pain points such as dependency problems, unstructured scripts, environment drift, unclear parameters and difficulty extending existing pipelines. With versioned releases and containerised components, workflows evolve from prototypes to production-ready, validated tools.
Toni concluded with a clear message: sustainable bioinformatics requires platform thinking. With Viash and OpenPipeline, teams can build workflows that remain reproducible, maintainable and ready for real-world scientific and clinical demands.